Abstract


Many texture-segmentation schemes use an elaborate bank of filters to decompose
a textured image into a joint space/spatial-frequency representation. While these
schemes  show  promise  and  some  analytical  work  has  been  done,  the  relationship  between 
texture  differences  and  the  filter  configurations  required  to  discriminate  them  remains 
largely  unknown.  This  thesis  examines  the  issue  of  designing  individual  filters.  Analysis 
based on mathematically defined texture models shows that applying a properly configured
bandpass filter to a textured image produces distinct output discontinuities at
texture boundaries. Depending on the type of texture difference and the filter parameters,
these discontinuities form one of four characteristic signatures: a step, a valley, a ridge,
or a step change in average local output variation. Accompanying experimental evidence
indicates  that  these  signatures  are  useful  for  segmenting  an  image.  Initially,  a  simple 
1-D  texture  model  is  used  to  derive  the  step  and  valley  signatures.  This  model  leads  to 
a  simple  analytical  development  providing  helpful  insight.  The  1-D  model,  however,  has 
certain  limitations.  For  example,  the  existence  of  the  ridge  signature  cannot  be  shown 
using  this  model.  Consequently,  a  more  general  2-D  model  is  also  presented,  leading 
to  a  more  complex  but  informative  analysis.  In  particular,  the  2-D  analysis  indicates 
those  texture  characteristics  that  are  responsible  for  each  signature  type  and  leads  to 
detailed  filter  design  criteria.  Even  the  2-D  analysis,  though,  makes  certain  simplifying 
assumptions  that  lead  to  inaccuracies  in  designing  filters  for  nonhomogeneous  textures. 
To  overcome  this  difficulty,  an  algorithm  was  developed  that  determines  the  “best”  filter 
parameters for an arbitrary texture pair. The algorithm effectively performs an exhaustive
(but efficient) search of the filter parameter space to determine the filter producing
the  highest  quality  signature.  Signal  detection  theory  is  used  to  provide  a  measure  of 
signature  quality.  Although  the  analyses  presented  in  this  study  are  based  on  filters 
derived  from  Gabor  elementary  functions,  it  is  the  bandpass  nature  of  the  filter  that  is 
essential;  thus,  the  results  apply  to  bandpass  filters  in  general. 
Chapter 1


Thesis  Overview 


Texture  segmentation,  which  is  the  partitioning  of  an  image  into  homogeneous 
textured regions, continues to be a challenging problem in computer vision. Classic statistical
and structural approaches, while applicable to many computer vision problems,
typically focus on particular image attributes that characterize the textures of interest.
Hence, the variety of textures that can be successfully segmented is limited. The human
visual system, on the other hand, can segment textures robustly. This realization
has motivated researchers in the fields of computer vision, psychophysics, and neurophysiology
to study how humans perceive textures and has resulted in a promising new
approach  to  texture  analysis. 

This  new  approach  is  based  on  the  concept  of  local  spatial  frequency.  Unlike 
classical  Fourier  analysis,  where  frequency  refers  to  sinusoids  of  infinite  extent,  the  new 
approach views frequency as a local phenomenon (a local frequency) that can vary with
position throughout an image. Textures are characterized by their local spatial-frequency
content. Two textures, then, can be segmented based on local-frequency differences.

One popular method for extracting these local frequencies is to apply a bank of
bandpass filters to an image. This results in a collection of subimages, where each subimage
contains a limited range of local spatial frequencies. Motivation for this approach
comes  partly  from  psychophysical  and  neurophysiological  evidence  suggesting  that  the 
human  visual  system  might  be  performing  this  function.  Although  several  filter-bank 
algorithms have shown promising results and some analysis has been done previously, the
relationship between texture differences and the filter configurations required to discriminate
those differences remains unknown. Specifying an appropriate filter configuration
involves  two  parts:  (1)  designing  individual  filters  and  (2)  specifying  filter  interactions. 
This  thesis  addresses  the  design  of  individual  filters  (portions  of  this  work  also  appear 
in  [1,  2,  3,  4,  5]). 
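
The filter-bank idea described above can be illustrated with a minimal 1-D sketch. It is only an illustration under stated assumptions: the sinusoidal test signal, the four center frequencies, the Gaussian envelope width, and the function names gabor_response and dominant_frequency are all hypothetical choices made here, not the filters specified in this thesis. Each channel is a complex Gabor filter, and the channel with the largest output magnitude at a position is taken as a crude estimate of the local frequency there.

```python
import cmath
import math

def gabor_response(signal, center, freq, sigma=8.0, half_width=24):
    """Magnitude of a complex Gabor filter's output at one position."""
    acc = 0j
    for n in range(-half_width, half_width + 1):
        envelope = math.exp(-(n * n) / (2.0 * sigma * sigma))
        acc += signal[center + n] * envelope * cmath.exp(-2j * math.pi * freq * n)
    return abs(acc)

def dominant_frequency(signal, center, bank):
    """Center frequency of the bank channel that responds most strongly."""
    return max(bank, key=lambda f: gabor_response(signal, center, f))

# Assumed test signal: 0.1 cycles/sample on the left, 0.2 on the right.
signal = [math.cos(2 * math.pi * 0.1 * n) for n in range(120)]
signal += [math.cos(2 * math.pi * 0.2 * n) for n in range(120)]

bank = [0.05, 0.1, 0.2, 0.3]  # assumed channel center frequencies
left_estimate = dominant_frequency(signal, 60, bank)
right_estimate = dominant_frequency(signal, 180, bank)
print(left_estimate, right_estimate)
```

In this sketch the strongest channel changes from one side of the signal to the other, which is the kind of local-frequency difference a filter bank is meant to expose.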

The  goal  is  to  design  filters  that  map  textural  differences  to  a  difference  in  average 
filter  output  so  that  simple  discontinuity  detectors  (e.g.,  an  edge  detector)  can  be  used 
to  segment  a  textured  image.  By  analyzing  filter  output  characteristics  as  a  function  of 
filter  parameters  and  textural  differences,  suitable  mappings  have  been  found  for  a  wide 
range  of  textures.  Details  of  the  filter  used  in  this  study  can  be  found  in  Chapter  3. 

Analysis based on a 1-D texture model shows that applying properly configured
bandpass filters to textured images produces distinct discontinuities at texture boundaries
(Chapter 4). Depending on the nature of the texture difference, these discontinuities
exhibit one of two characteristic signatures: a step (Fig. 1.1a) or a valley (Fig. 1.1b).
Experimental evidence indicates that these signatures are useful for texture segmentation.
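
As a rough illustration of the step and valley signatures, the following minimal sketch filters two synthetic 1-D signals with a single complex Gabor filter and inspects the output magnitude. The sinusoidal "textures", the frequencies f1 and f2, the envelope width, and all function names are assumptions chosen for illustration; they are not the texture model or the filter parameters developed in Chapter 4. One signal changes frequency at the boundary, and the other keeps its frequency but jumps in phase by half a period.

```python
import cmath
import math

def gabor_kernel(freq, sigma, half_width):
    """Complex 1-D Gabor kernel: Gaussian envelope times a complex sinusoid."""
    return [
        math.exp(-(n * n) / (2.0 * sigma * sigma)) * cmath.exp(2j * math.pi * freq * n)
        for n in range(-half_width, half_width + 1)
    ]

def filter_magnitude(signal, kernel):
    """Output magnitude at every position where the kernel fits entirely."""
    half = len(kernel) // 2
    out = []
    for i in range(half, len(signal) - half):
        acc = sum(signal[i + k - half] * kernel[k].conjugate()
                  for k in range(len(kernel)))
        out.append(abs(acc))
    return out

N = 200                 # samples per region (assumed)
f1, f2 = 0.125, 0.25    # assumed texture frequencies in cycles/sample

left = [math.cos(2 * math.pi * f1 * n) for n in range(N)]
# Frequency difference across the boundary.
freq_pair = left + [math.cos(2 * math.pi * f2 * n) for n in range(N)]
# Same frequency, but a half-period phase shift across the boundary.
phase_pair = left + [math.cos(2 * math.pi * f1 * n + math.pi) for n in range(N)]

kernel = gabor_kernel(freq=f1, sigma=8.0, half_width=24)
step_out = filter_magnitude(freq_pair, kernel)
valley_out = filter_magnitude(phase_pair, kernel)

# Output index j corresponds to signal index j + 24; the boundary is near j = 176.
print("step:  ", round(step_out[50], 2), round(step_out[300], 2))
print("valley:", round(valley_out[50], 2),
      round(min(valley_out[170:182]), 2), round(valley_out[300], 2))
```

With the filter tuned to f1, the first output is high on the left and low on the right (a step), while the second is high on both sides and dips near the boundary (a valley).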

Additional insight is provided in Chapter 5 by extending the analysis to 2-D. In
2-D, texture is modeled as a collection of primitive geometric objects called texels. A homogeneous
textured region consists of similar texels, and texture differences are induced by
varying the type and/or organization of the texels. For convenience, two levels of textural
complexity are recognized: uniform and nonuniform. Uniformly textured regions consist
of identical texels arranged in a regular lattice. For nonuniformly textured regions, the
texels may vary in orientation, and their shapes and positions may be perturbed.

Fig. 1.2. Example of the ridge signature.

In  addition  to  the  step  and  valley  signatures  predicted  by  the  1-D  model,  the 
2-D  model  shows  that  a  ridge  signature  can  also  occur  (Fig.  1.2).  Analysis  based  on 
uniform  textures  shows  that  the  step  signature  occurs  when  two  textured  regions  differ 
in  constituent  texels  (Fig.  1.3a).  On  the  other  hand,  the  valley  and  ridge  signatures 
occur  when  two  regions  exhibit  a  texture-phase  difference  (Fig.  1.3b),  resulting  from 
spatial  shifts  between  regions.  The  analysis  also  provides  specific  guidelines  for  selecting 
filter parameters to produce quality signatures (Chapter 6). In particular, the conditions
favoring asymmetric filters are revealed, an issue not previously addressed.

For nonuniform textures, a detailed analysis is impractical due to their complexity.
Experimental results suggest, however, that the signatures found for uniform textures
occur for textures in general. Due to the texel variation in nonuniform textures, though,
these signatures can exhibit local output variations, which can hinder segmentation. By
judicious selection of filter parameters coupled with post-filter smoothing, distinct signatures
can often be achieved, and the image can be easily segmented. Unfortunately,
the guidelines for selecting filter parameters, which were developed analytically for uniform
textures, are only approximately correct for nonuniform textures. To overcome
this problem, an algorithm is developed in Chapter 7 to find the “best” filter parameters
for any given texture pair. The algorithm has been applied successfully to a variety of
textures, including synthetic, natural, uniform, and nonuniform textures.

Fig. 1.3. Textured regions exhibiting two types of texture difference:

(a) Texel difference: regions differ in constituent texels.

(b) Texture-phase difference: spatial shift between regions.

In  addition  to  the  three  signature  types  mentioned  earlier,  texel  variation  in 
nonuniform  textures  can  produce  a  fourth  signature  type,  which  is  a  step  change  in 
average  local  output  variation  (Fig.  1.4).  Although  this  signature  does  not  conform  to 
the  design  goal  mentioned  previously,  simple  post-filtering  operations  can  transform  this 
signature  into  a  step  signature. 

Chapter 8 presents experimental evidence supporting the analyses. Examples are
provided demonstrating the signatures mentioned above and the texture/filter combinations
that produced them. Chapter 9 provides concluding remarks.


Chapter  2 


Introduction 

This  thesis  concerns  computational  methods  for  analyzing  texture.  But,  what 
exactly is texture? The dictionary describes texture as “the visual and especially tactile
quality of a surface” [6]. Some examples that come to mind are a grass lawn, a
sandy beach, and a woven fabric. The dictionary goes on to characterize texture as the
“...physical structure given to an object by the size, shape, arrangement, and proportions
of its parts” [6]. Referring to the previous examples, this structure is formed by
the  blades  of  grass,  the  particles  of  sand,  and  the  weave  in  the  fabric.  Although  textured 
surfaces  are  inherently  three  dimensional,  the  emphasis  of  my  research  is  on  monocular 
vision.  Thus,  subsequent  discussion  and  analyses  are  limited  to  the  planar  projections  of 
textures  called  textural  images.  Also,  the  structural  properties  of  texture  are  of  primary 
interest;  so,  color  and  average  intensity  differences  between  textures  are  ignored. 

The analysis of textured images can be divided into four categories: discrimination,
segmentation, classification, and shape from texture. In texture discrimination, the
goal  is  to  determine  whether  or  not  a  texture  difference  exists  between  two  regions  of 
an  image.  Referring  to  Fig.  2.1,  for  example,  the  task  might  be  to  determine  if  region  I 
differs  from  region  II.  Texture  segmentation  involves  partitioning  an  image  into  regions 
of homogeneous texture. Three such partitions are shown in Fig. 2.1. The difference between
segmentation and discrimination is that segmentation determines the boundaries
between textured regions, whereas in discrimination, the regions are known a priori;
thus, segmentation is more difficult. Once an image is properly segmented into regions,
texture  classification  can  be  used  to  identify  each  region  by  type;  e.g.,  as  grass,  sand,  or 
fabric.  Texture  analysis  can  also  provide  clues  to  the  shape  of  objects.  A  simple  example 
of  shape  from  texture  is  the  use  of  texture  gradients  for  determining  surface  orientation 
[7].  For  example,  a  surface  that  is  oblique  to  the  viewing  plane  produces  image  structure 
that  changes  scale  with  image  coordinates  (see  Fig.  2.2).  Measuring  this  change  in  scale 
can  provide  an  estimate  of  surface  orientation.  As  with  classification,  an  image  must 
first  be  segmented  before  texture  gradients  can  be  computed.  My  research  is  primarily 
concerned  with  techniques  for  texture  discrimination  and  segmentation.  And,  though 
the  methods  developed  in  this  thesis  can  be  extended  to  solve  classification  problems, 
the  study  of  texture  classification  and  shape  from  texture  is  beyond  the  scope  of  this 
work. 

A  major  problem  in  developing  robust  methods  for  texture  analysis  is  the  lack  of 
a  precise  definition  for  texture.  Although  an  intuitive  description  of  texture  was  given 
earlier,  it  is  far  from  a  comprehensive  definition.  Texture,  it  seems,  is  one  of  those  terms 
that  defies  mathematical  definition. 

To  illustrate  the  difficulty  in  defining  texture,  consider  how  humans  segment 
the  following  textures.  (These  examples,  which  are  commonly  referred  to  as  synthetic 
textures,  typify  those  developed  by  researchers  to  exhibit  specific  textural  properties 
[8,  9,  10,  11].)  Fig.  2.3a  consists  of  a  region  of  Rs  and  another  of  mirror-image  Rs.  Note 
that, although it is easy to distinguish between a single R and its mirror image, considerable
effort is required to determine the region boundary in this figure. Evidently, the
random orientations mask the differences between texels. This is not always the case, as
demonstrated in Fig. 2.3b. Here, the Rs are easily distinguished from the Ts in isolation
or when randomly oriented in a texture. In many cases, differences in orientation can
cause segmentation. For example, the center region in Fig. 2.4 is easily distinguishable
from the background even though the lines in the two regions differ only slightly in average
orientation [8]. Finally, consider Fig. 2.5. The bottom region consists of alternating
columns of Us and inverted Us, whereas the top consists of alternating rows. For most
observers, however, it is the five horizontal black bars (so-called “emergent” features) that
seem to attract attention [9]. Note that the actual boundary between regions goes unnoticed.
Thus, in this case, the difference in texture is not even the dominant feature. It
is important to point out that using human performance as an indicator of textural differences
is not the only alternative. For instance, a particular application might require
distinguishing Rs from mirror-image Rs. Thus, what constitutes a textural difference
can depend on the application. As these examples demonstrate, textural differences can
be difficult to characterize.

Fig. 2.2. Surface orientation perceived due to a texture gradient (from
Blostein and Ahuja [7]).

Fig. 2.3. Textures with randomly oriented texels:

(a) Pair of textures consisting of Rs on the left and mirror-image Rs on the
right. Texture pair not easily distinguishable.

(b) Pair of textures consisting of Rs on the left and Ts on the right. Texture
pair easily distinguishable.

In  the  absence  of  a  precise  definition  for  texture,  researchers  have  resorted  to 
more  qualitative  descriptions.  Rao  has  proposed  that  textures  can  be  grouped  into  four 
classes:  strongly  ordered,  weakly  ordered,  disordered,  and  compositional  [12].  Fig.  2.6 
shows examples of naturally occurring textures (from Brodatz [13]) illustrating Rao’s
taxonomy. Fig. 2.6a is “cotton canvas”, an example of a strongly ordered texture. This
class is characterized by an arrangement of primitive geometric shapes called texels.
Fig. 2.6b is “straw”. In this example, the characteristic feature is a globally oriented
structure. Textures with this property are called weakly ordered. The third example
(Fig. 2.6c) is “grass lawn”, a disordered texture. This class exhibits neither an obvious pattern
of texels nor a dominant orientation; rather, the texture seems to be described best by
the statistical distribution of image pixels. Compositional textures, such as the example
in Fig. 2.6d, are just combinations of the other three classes.

Fig. 2.4. Segmentation due to a difference in the average orientation of line
segments (from Nothdurft [8]).

Fig. 2.5. Alternating rows of Us and inverted Us form black bars as “emergent”
features. Boundary between regions of alternating rows and alternating
columns is not the dominant feature (from Beck et al. [9]).

Fig. 2.6. Natural textures demonstrating Rao’s taxonomy (from Brodatz
[13]):

(a) “cotton canvas”: strongly ordered;

(b) “straw”: weakly ordered;

(c) “grass lawn”: disordered;

(d) “lace”: compositional.

Classic texture-analysis techniques tend to be divided along these class boundaries.
For example, statistical methods [14, 15, 16, 17], random field models [10, 14,
18, 19, 20], and fractals [21, 22] have been used to model disordered textures, while
collections of geometric primitives [23, 24, 25] and mosaic models [26] have been used
for strongly ordered textures. Rao recommends modeling weakly ordered textures using
orientation fields [12]. Little has been done with compositional textures.

While these methods can be effective for textures within a particular class, performance
is typically poor outside the class. The human visual system, on the other hand,
can analyze textures robustly. This realization has motivated researchers in the fields
of computer vision, psychophysics, and neurophysiology to study how humans perceive
textures in an effort to develop more robust machine-vision texture-analysis schemes.

Early insights into human texture-perception mechanisms were provided by psychophysical
experiments [10]. These experiments test human response to carefully controlled
stimuli. By controlling stimulus properties, researchers attempt to deprive the
visual system of familiar cues and to force it to rely on primitive mechanisms. Experiments
such as these have shown that texture segmentation is a spontaneous process not
involving conscious comparisons, suggesting that texture perception differs from form
perception [27, 28]. To illustrate this point, consider the examples in Figs. 2.7 and 2.8.
In Fig. 2.7, it is easy to recognize the two different regions; however, in Fig. 2.8, considerable
effort is required to discover that the three columns on the left are words, and the
three columns on the right are nonsense words [29]. These dramatic differences in visual
performance prompted Neisser to propose that the human visual system consists of
two  operating  modes,  a  preattentive  mode  and  an  attentive  mode  [30].  The  preattentive 
mode  processes  a  large  portion  of  the  visual  field  quickly  (presumably  in  parallel)  but 
imprecisely.  In  contrast,  the  attentive  mode  operates  within  a  much  smaller  aperture 
and  at  a  much  slower  rate  but  can  perform  more  detailed  image  analysis.  The  rapid 
discrimination  and  segmentation  of  textures  is  attributed  to  the  preattentive  mode  and 
is  often  termed  preattentive  texture  discrimination  (or  segmentation).  The  perception  of 
form,  on  the  other  hand,  requires  scrutiny,  suggesting  that  the  attentive  mode  is  needed 
for  this  task. 

The  notion  of  a  fast,  parallel,  imprecise  system  is  the  foundation  of  many  modern 
texture-analysis  models.  Section  2.1  describes  early  efforts  to  model  human  preattentive 
texture  discrimination.  These  methods  attempt  to  characterize  texture  differences  by 
the  statistical  properties  of  image  pixels  or  by  differences  in  geometric  features  such  as 
edges,  lines,  and  blobs.  Section  2.2  introduces  a  more  recent  approach  that  involves 
detecting local spatial-frequency differences between textured regions. It also describes
the  application  of  the  local  spatial-frequency  approach  to  texture  analysis  and  discusses 
open  questions  leading  to  this  thesis. 

2.1  Early  Texture  Perception  Models 

The first attempts to model human texture perception were made by Julesz in the
early 1960s [10]. He set out to determine if texture discrimination could be described
by the statistical properties of textures alone, or whether it was necessary to consider
the internal operation of the visual system. He proceeded to generate binary (black
and white) texture pairs whose pixels had predetermined nth-order joint probability
distributions.¹ Each textured region consisted of a collection of micropatterns, called
texture elements (texels), either regularly spaced or thrown at random (Fig. 2.9). Julesz
observed that most textures that have different first- or second-order distributions are
easily discriminated, whereas textures that differ only in higher-order distributions are
not discriminable. Counterexamples, however, were found [18, 29], which led Gagalowicz
to propose an alternative explanation. He observed that due to inhomogeneities in
constructing stochastic textures, local statistics differed greatly from the global statistics
[18]. He proposed that humans discriminate textures based on local computations,
and that humans cannot discriminate textures that have the same local second-order
statistics.

Fig. 2.7. A preattentively segmentable image. Regions differ in dot density
(from Julesz et al. [29]).

Fig. 2.8. An image not preattentively segmentable. Columns of words and
nonsense words (from Julesz et al. [29]).
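
To make the distinction between first- and second-order statistics concrete, here is a minimal sketch on two small binary images. The stripe and checkerboard patterns, the image size, and the function names first_order and second_order are assumptions made for illustration; they are not the stimuli or the procedure used by Julesz. The first-order statistic is the fraction of pixels equal to 1, and the second-order statistic is the joint distribution of pixel pairs separated by a fixed displacement.

```python
from collections import Counter

def first_order(img):
    """First-order statistic: fraction of pixels equal to 1."""
    pixels = [p for row in img for p in row]
    return sum(pixels) / len(pixels)

def second_order(img, dy, dx):
    """Joint distribution of (pixel, pixel displaced by (dy, dx)) over valid pairs."""
    height, width = len(img), len(img[0])
    counts = Counter()
    for y in range(height):
        for x in range(width):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < height and 0 <= x2 < width:
                counts[(img[y][x], img[y2][x2])] += 1
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()}

# Two assumed 8x8 binary patterns with the same fraction of 1s.
stripes = [[x % 2 for x in range(8)] for y in range(8)]
checker = [[(x + y) % 2 for x in range(8)] for y in range(8)]

print(first_order(stripes), first_order(checker))
print(second_order(stripes, 1, 0))   # vertical neighbors always agree
print(second_order(checker, 1, 0))   # vertical neighbors always differ
```

The two patterns share the same first-order statistic but have different second-order statistics for vertically adjacent pixels, so a first-order measurement alone could not separate them.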

At  about  the  same  time,  Julesz  proposed  a  more  localized  model  with  his  texton 
theory  [23,  24].  He  found  that  many  textures  that  have  different  first  or  second-order 
distributions  also  differ  in  some  local  features,  which  Julesz  called  textons.  He  proposed 
that  preattentive  texture  discrimination  is  due  either  to  differences  in  texton  type  or 
to differences in the number of textons. Features considered to be textons included
color, elongated blobs (with some orientation, width, and length), terminations, and
crossings (i.e., points where lines intersect) [31]. Note that under this theory, the relative
positions of textons are unimportant for texture discrimination; only the number or type
is significant. Therefore, configurational differences are explicitly ignored. Fig. 2.10 is
a texture considered to be discriminable due to a difference in the number of textons
(terminations) [31]. Discrimination fails, though, in Fig. 2.11 due to a lack of any texton
difference [31].

¹ A first-order probability distribution refers to a probability distribution with one random variable, a
second-order joint distribution refers to a probability distribution involving two random variables (pairs
of pixels), etc.

Fig. 2.9. Texture pair consisting of randomly “thrown” micropatterns. Regions
preattentively discriminable (from Julesz [24]).

Texton  theory,  however,  has  certain  limitations.  First,  textons  are  defined  only 
verbally (e.g., elongated blob), and thus their characteristics must be inferred from examples
[11, 32]. Since textons are only vaguely defined, it seems one must solve the
pattern-recognition problem to recognize a texton. Second, there is some doubt that
terminators and crossings are textons [11, 28, 33]. Third, the rejection of possible configurational
effects seems inconsistent with the results of others [9, 34].

An  alternative  to  texton  theory  was  proposed  by  Beck  et  al.  [9].  They  observed 
that  texture  discrimination  occurs  easily  when  textured  regions  differ  in  the  slopes,  sizes, 
colors,  and  brightness  of  the  texels  or  their  component  parts.  They  also  observed  that 
differences  in  texel  configuration  can  affect  discrimination  (Fig.  2.12).  Based  on  these 
observations, they proposed (similar to Julesz’s texton theory) that texture discrimination
is based on first-order differences (i.e., differences in distribution) in image features.
However,  unlike  Julesz,  they  suggested  that  these  features  are  not  computed  directly 
from  the  retinal  array.  Instead,  only  a  few  simple  features  are  detected  directly.  These 
features  are  then  “linked”  into  higher  order  texels  based  on  the  Gestalt  heuristics  of 
proximity,  similarity,  and  good  continuation  [35].  Texture  discrimination  is  the  result 
of  feature  differences  between  texels.  Beck  et  al.  do  not  define  their  primitive  features 
explicitly.  Rather,  they  describe  them  as  those  objects  that  best  stimulate  the  simple 
cells in the visual cortex (e.g., the edges and bars proposed by Hubel and Wiesel [36]).

Fig. 2.10. Texture pair whose texels differ in the number of textons (terminations).
Regions preattentively discriminable (from Julesz [31]).

Fig. 2.11. Texture pair whose texels have the same number of textons.
Regions not preattentively discriminable (from Julesz [31]).

Fig. 2.12. Region discrimination due to configurational differences. Center
region consists of collinear line segments (from Beck et al. [9]).

Similar to the linking idea of Beck et al., Marr and Hildreth suggested that complex
objects are formed by hierarchical grouping, starting with primitive features [37].
Marr and Hildreth, however, proposed a different set of primitive features. They suggested
that the function of simple cells is to locate intensity gradients rather than edges or
bars. They introduced the Laplacian-of-Gaussian operator (∇²G), which approximates
the receptive field profiles of simple cells and, when convolved with an image, indicates
the location of intensity gradients over a selected range of scales [25, 37]. The effect of the
Gaussian is to lowpass filter the image, thus eliminating fine-grain intensity changes.
Applying the Laplacian to this filtered image results in zero values at the locations of
maximum intensity gradient. The positions where these zero values occur (called zero
crossings) are used to form the primitive features in the Marr-Hildreth model. These
zero crossings are then grouped into abstract objects called place tokens. Place tokens
represent edges, bars, blobs, and terminators with properties such as orientation, contrast,
length, width, and position. These tokens are computed over a range of scales and
can themselves be grouped to form larger and larger tokens. Differences in the properties
or configurations of tokens become the basis for texture discrimination.
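
The zero-crossing idea can be sketched in one dimension, where the Laplacian of a Gaussian reduces to the second derivative of a Gaussian. This is a simplified illustration, not the Marr-Hildreth implementation: the 1-D reduction, the kernel truncation, the mean subtraction that forces the kernel to sum to zero, the threshold eps, and the function names are all assumptions introduced here.

```python
import math

def log_kernel_1d(sigma, half_width):
    """Second derivative of a Gaussian, the 1-D analogue of the LoG."""
    kernel = []
    for n in range(-half_width, half_width + 1):
        gauss = math.exp(-(n * n) / (2.0 * sigma * sigma))
        kernel.append((n * n / sigma ** 4 - 1.0 / sigma ** 2) * gauss)
    mean = sum(kernel) / len(kernel)
    # Subtract the mean so the truncated kernel sums to zero (flat regions map to zero).
    return [v - mean for v in kernel]

def filter_valid(signal, kernel):
    """Filter output at every position where the kernel fits entirely."""
    half = len(kernel) // 2
    return [
        sum(signal[i + k - half] * kernel[k] for k in range(len(kernel)))
        for i in range(half, len(signal) - half)
    ]

def zero_crossings(response, offset, eps=1e-9):
    """Signal indices i where the response changes sign between i and i + 1."""
    found = []
    for j in range(len(response) - 1):
        a, b = response[j], response[j + 1]
        if (a > eps and b < -eps) or (a < -eps and b > eps):
            found.append(j + offset)
    return found

half_width = 12
edge = [0.0] * 50 + [1.0] * 50   # assumed test signal: a single intensity step
response = filter_valid(edge, log_kernel_1d(sigma=3.0, half_width=half_width))
crossings = zero_crossings(response, offset=half_width)
print(crossings)
```

For this step signal, the response is positive just before the step and negative just after it, so the only sign change falls at the location of the maximum intensity gradient.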

Both Beck et al. and Marr and Hildreth attribute texture discrimination to differences
in the collective properties of higher-order objects (texels or place tokens). The
idea of comparing collections of properties (a kind of feature vector correlation), however,
is inconsistent with certain psychophysical findings. For example, Treisman found
that certain combinations of otherwise discriminable features do not produce texture
discrimination [28]. By way of illustration, consider the images in Fig. 2.13. The theories
of Beck et al. or Marr and Hildreth would form three regions in each frame. In
Fig. 2.13a, discrimination would be based on differences in texel shape. In Fig. 2.13b,
the difference is due to shading, and in Fig. 2.13c, the dark circles would form one
region, the open circles another, and the open triangles a third; however, if Fig. 2.13c
is viewed quickly, only two regions are perceived. From this and other psychophysical
experiments, Treisman concluded that the conjunction of features (e.g., a red, vertical
blob at position (x, y)) is not available to the preattentive system. Rather, each feature
has its own separate representation in space (a feature map). In this way, the position of
something red can be detected, and the position of a blob can be detected, but the correspondence
between red and blob is not represented directly. Only by attentive search
can this correspondence be resolved. See Fig. 2.14 for a diagram of this model.

Fig. 2.13. Images consisting of three regions (from Treisman [28]). Region
differences:

(a) Texel shape.

(b) Shading.

(c) Conjunction of features.

Iwama and Maida, using edge segments and terminations as primitive features,
have developed a texture-segmentation architecture that combines the token idea of Marr
and the concept of feature integration [38]. Their model performs well over a range of
textures and, in particular, has the unusual property of being able to represent overlapping
textures (Fig. 2.15). A careful examination of the implementation details, however,
reveals the following limitations. First, selecting geometric objects as the primitive feature
set makes it difficult to develop robust feature detectors (primarily because the
features are not defined mathematically). Second, the criteria for grouping are largely
unspecified. This requires the specification of many empirical parameters, making it
difficult to predict performance on untested images. These problems are not unique to
Iwama and Maida, but are inherent to models based on semantic descriptions.
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RETINAL  IMAGE 


Fig.  2.14.  Treisman’s  feature  integration  model  (from  Treisman  [28]). 


00 

OOO 

O  GO  O  /  //' 

OO  CO  //  // 

OCOO  //■  // 
OOCO/V 
O'  fjO'O 

// oobo 

//OOO  o 


</  // 

//  / 

/  / 


m  o 
GDO  O 
OOGD 
CO  OO 
O  OO 


Fig.  2.15.  Overlapping  texture  pair  (from  Iwama  et  al.  [38]). 
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2.2  The  Local  Spatial-Frequency  Approach 

With the exception of Marr, the texture-perception theories mentioned thus far
consider the primitive features to be geometric objects such as edges and bars. This idea
was motivated largely by the findings of Hubel and Wiesel, who found that simple cells
in the visual cortex responded better to these objects than to diffuse light [36]. More
recently, however, an alternate interpretation of cortical cell function was proposed - that
of a spatial-frequency analyzer [39, 40, 41, 42, 43, 44, 45, 46].

While  measuring  human  contrast  sensitivity  to  simple  functions  (sine  waves, 
square waves, etc.), Campbell and Robson discovered that the response to these
functions could be predicted from the frequency components of the waveform under test
[39]. They proposed that the visual system behaves as a number of independent
detector mechanisms each preceded by a narrow-band filter tuned to a different frequency.
They  suggested  that  each  filter/detector  pair  constituted  a  separate  channel,  and  each 
channel  would  have  its  own  contrast  sensitivity  function  (i.e.,  bandpass  characteristics). 
Subsequently,  neurophysiological  evidence  appeared  suggesting  that  simple  cells  respond 
better  to  the  frequency  components  of  a  stimulus  than  to  its  geometric  features  [40,  41]. 
The  initial  tendency  was  to  propose  that  the  visual  system  was  computing  the  Fourier 
transform  of  the  image.  This  idea  was  dispelled  by  Julesz  and  Caelli  who  pointed  out 
that  the  Fourier  transform  is  a  global  operation,  and  thus  is  not  capable  of  representing 
local  intensity  variations  explicitly  [47]. 

The abundance of apparently conflicting evidence prompted a debate as to the
functionality of simple cells: Are they feature detectors or spatial-frequency analyzers
[48]? Viewing simple cells as feature detectors implies that the representation of an
image in the visual cortex is in the spatial domain, whereas viewing them as frequency
analyzers suggests a spatial-frequency domain representation. Experimental work by
Maffei et al., however, suggests that the two views are not necessarily contradictory
[45]. Their results indicate that both interpretations of simple cell function can be
phenomenologically correct. This apparent paradox can be explained using the concept
of local frequency.

The concept of local frequency was developed by Gabor [49] for 1-D signals and
extended to images by Daugman [50]. Their work shows that the spatial representation
and the spatial-frequency representation are just opposite extremes of a continuum of
possible joint space/spatial-frequency representations. In a joint space/spatial-frequency
representation, frequency can be viewed as a local phenomenon (i.e., a local frequency)
that can vary with position throughout the image. These local frequencies arise due to
local interactions among groups of sinusoids. These sinusoids, which differ in frequency
and phase, constructively interfere with each other to produce spatially localized
concentrations of signal energy. It is this localized signal energy that forms the intensity
patterns in an image.

Marcelja demonstrated the plausibility of a joint space/spatial-frequency
representation in the human visual system by showing that the functions described by Gabor
closely approximate the receptive field profiles of simple cells. Subsequently, additional
neurophysiological support appeared [42, 44, 51, 52].


The evidence suggesting that early visual mechanisms are performing local
spatial-frequency analysis spawned a number of texture-discrimination (and segmentation)
models [53, 54, 55, 56, 57, 58, 59, 60]. One characteristic that distinguishes these models
from feature-based models is that textural differences are not viewed as resulting from
geometric feature differences. Rather, local spatial-frequency models assume that
perceptually significant textural differences correspond to differences in local
spatial-frequency content. By decomposing an image into a joint space/spatial-frequency
representation, the distribution of local spatial frequencies within the image can be
determined. This distribution of frequencies, then, becomes the basis for texture analysis.

The representation of an image in either the spatial domain or the spatial-frequency
domain is unique. For any image there is only one pixel array or one Fourier transform.
In the joint space/spatial-frequency domain, however, an infinite number of
representations are possible, and several techniques are available for performing the
decomposition. One method for decomposing an image into a joint space/spatial-frequency
representation is the windowed Fourier transform. To compute the continuous windowed
Fourier transform, the following equation is evaluated (shown in 1-D for simplicity):

$$h(x, f) = \int_{-\infty}^{\infty} g(\tau)\, w^{*}(\tau - x)\, e^{-j 2\pi f \tau}\, d\tau \tag{2.1}$$

Here $w^{*}$ is the complex conjugate of the window function $w$, $g$ is the function to be
transformed, and $h(x, f)$ is the joint-domain representation of the (1-D) “image.” The
special case when the window function is a Gaussian is called the Gabor transform
[49, 61]. Note that the windowed Fourier transform is similar to the classic Fourier
transform except that the input is multiplied by a window function, whose position is
parameterized. In effect, (2.1) computes the Fourier transform of a subset of the original
image - hence the term local frequency.

Although (2.1) can be viewed as computing a local Fourier transform, there is
another useful interpretation. Note that

$$w^{*}(\tau - x)\, e^{-j 2\pi f \tau}$$

in (2.1) is a modulated window function and therefore has bandpass characteristics.
Thus, (2.1) can be interpreted as filtering the image $g$ with a bandpass filter, where
the center frequency of the filter is $f$ and its bandwidth is determined by the window
function.
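
The bandpass interpretation can be checked numerically. The following Python sketch is illustrative only and is not part of the original analysis: it discretizes (2.1) assuming unit sample spacing, a real Gaussian window, and the kernel $e^{-j2\pi f\tau}$ with $f$ in cycles per sample. The test signal, the two frequencies, and the window width are hypothetical values chosen for the demonstration.

```python
import cmath
import math


def windowed_ft(g, x, f, sigma):
    """Discrete approximation of (2.1) at position x and frequency f.

    Assumes unit sample spacing and a real Gaussian window of width sigma,
    so the conjugate w* equals w.
    """
    total = 0j
    for tau, g_tau in enumerate(g):
        w = math.exp(-0.5 * ((tau - x) / sigma) ** 2)
        total += g_tau * w * cmath.exp(-2j * math.pi * f * tau)
    return total


# Hypothetical 1-D "image": frequency F_A on the left half, F_B on the right half.
N_SAMPLES = 256
F_A, F_B = 0.05, 0.20  # cycles per sample
signal = [
    math.cos(2 * math.pi * (F_A if t < N_SAMPLES // 2 else F_B) * t)
    for t in range(N_SAMPLES)
]

SIGMA = 8.0
left = abs(windowed_ft(signal, 64, F_A, SIGMA))    # window inside the F_A region
right = abs(windowed_ft(signal, 192, F_A, SIGMA))  # window inside the F_B region
print(f"|h(64, F_A)| = {left:.3f}, |h(192, F_A)| = {right:.3g}")
```

With the analysis frequency set to `F_A`, the magnitude is large where the signal contains that frequency and essentially vanishes where it does not, which is the behavior expected of a bandpass filter centered at `F_A`.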

The  application  of  bandpass  filters  to  images  is  an  integral  part  of  many  texture- 
analysis  schemes,  including  those  based  on  Gabor  elementary  functions  [53,  55],  wavelet 
transforms  [62,  63],  derivatives  of  Gaussians  (Hermite  polynomials)  [64],  and  differences 
of  offset  Gaussians  [56,  64].  Although  methods  differ  in  the  bandpass  characteristics  of 
the  individual  filters  and  how  the  filters  are  distributed  over  the  frequency  domain,  they 
can  be  collectively  referred  to  as  filter-bank  models.  A  schematic  of  a  typical  filter-bank 
architecture  is  shown  in  Fig.  2.16.  While  the  filter-bank  paradigm  has  shown  potential 
and  some  analytical  work  has  been  done  to  demonstrate  the  efficacy  of  certain  types  of 
filters,  the  relationship  between  texture  differences  and  the  filter  configurations  required 
to  discriminate  them  remains  largely  unknown.  Two  major  issues  arise:  (1)  the  design  of 
individual filters, and (2) the configuration of the filter bank. An adequate understanding
of how to design an individual filter seems essential for understanding how to build a
suitable filter bank. Therefore my thesis addresses the issue of filter design.

The  following  chapters  provide  a  detailed  analysis  of  filter  design.  The  analysis 
assumes  that  the  filter  is  based  on  a  Gabor  elementary  function,  which  is  a  Gaussian 
modulated  by  a  complex  sinusoid  (i.e.,  it  assumes  a  Gaussian  window  function  in  (2.1)). 
The  analysis  shows,  however,  that  it  is  the  bandpass  characteristic  of  a  filter  function 
that  is  essential.  Thus,  other  filter  functions  could  conceivably  be  used. 


Chapter  3 


Defining  the  Filter 


Subsequent analyses of textured images assume the following filter structure:

$$m(x, y) = G_f(i(x, y)) = |i(x, y) * h(x, y)| \tag{3.1}$$

where $*$ denotes convolution, $i$ is an image, $h$ is a Gabor elementary function (GEF),
and $m$ is the filter output. The filtering operator $G_f$ in (3.1) will be called a Gabor
filter. The form of the Gabor filter is justified below.

GEFs  possess  three  desirable  properties  for  texture  analysis: 

•  The GEFs are the only functions that achieve the lower bound of the space-bandwidth
product as specified by the uncertainty principle [65]. This means that they can
simultaneously be optimally localized in both the spatial and spatial-frequency
domains. Thus, GEFs can be designed to be highly selective in frequency,
while displaying good spatial localization.

•  The  shapes  of  GEFs  resemble  the  receptive  field  profiles  of  the  simple  cells  in  the 
visual  pathway  [44,  46]. 

•  They  are  bandpass  filters.  Thus,  GEFs  can  be  configured  to  extract  a  specific  band 
of  frequency  components  from  an  image. 
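
The first property can be illustrated numerically. The sketch below is an illustration rather than part of the thesis: it assumes that "width" and "bandwidth" mean the standard deviations of $|h|^2$ and $|H|^2$ (radian frequency), for which the uncertainty bound on their product is 1/2. Only the real envelope is examined, since modulation by a complex sinusoid shifts the center frequency without changing either spread. The function names and the comparison window are hypothetical choices.

```python
import math


def spread_product(h, half_width, step=0.01):
    """Return (rms width of |h|^2) * (rms bandwidth of |H|^2) for a real envelope h.

    Assumes h is centered at x = 0 with zero mean frequency. The bandwidth is
    obtained from Parseval's relation as the energy of dh/dx over the energy
    of h, so no Fourier transform has to be computed.
    """
    n = int(round(half_width / step))
    xs = [k * step for k in range(-n, n + 1)]
    energy = sum(h(x) ** 2 for x in xs)
    second_moment = sum((x * h(x)) ** 2 for x in xs)
    slope_energy = sum(
        ((h(x + step / 2) - h(x - step / 2)) / step) ** 2 for x in xs
    )
    return math.sqrt(second_moment / energy) * math.sqrt(slope_energy / energy)


def gaussian(x):
    return math.exp(-0.5 * x * x)


def two_sided_exponential(x):
    return math.exp(-abs(x))


gaussian_product = spread_product(gaussian, half_width=8.0)
exponential_product = spread_product(two_sided_exponential, half_width=12.0)
print(f"Gaussian: {gaussian_product:.4f}")
print(f"Two-sided exponential: {exponential_product:.4f}")
```

Under these definitions the Gaussian envelope sits at the bound, while the two-sided exponential, used here only as a contrasting window, gives a noticeably larger product.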
GEFs were first defined by Gabor [49] and later extended to 2-D by Daugman
[50]. (A few researchers have referred to GEFs as Gabor wavelets [62, 63].) A GEF is a
Gaussian modulated by a complex sinusoid [49, 50, 53]

$$h(x, y) = g(x', y') \exp[j(Ux + Vy)] \tag{3.2}$$

where $(x', y') = (x\cos\theta + y\sin\theta,\; -x\sin\theta + y\cos\theta)$ are rotated
spatial-domain rectilinear coordinates, $(u, v)$ are frequency-domain rectilinear
coordinates, and $(U, V)$ give the particular 2-D frequency of the complex sinusoid. The
angle $\phi = \tan^{-1}(V/U)$ specifies the orientation of the sinusoid, and $g(x, y)$ is
the 2-D Gaussian

$$g(x, y) = \frac{1}{2\pi\sigma_x\sigma_y} \exp\left\{-\frac{1}{2}\left[\left(\frac{x}{\sigma_x}\right)^{2} + \left(\frac{y}{\sigma_y}\right)^{2}\right]\right\} \tag{3.3}$$

and $(\sigma_x, \sigma_y)$ characterize the spatial extent and bandwidth of $h$. The aspect
ratio of $g(x, y)$ is given by $\lambda = \sigma_y/\sigma_x$ and gives a measure of the
filter’s asymmetry. An example of the real part of a GEF is shown in Fig. 3.1. The Fourier
transform of $h$ is

$$H(u, v) = \exp\left\{-\frac{1}{2}\left[\left(\sigma_x [u - U]'\right)^{2} + \left(\sigma_y [v - V]'\right)^{2}\right]\right\} \tag{3.4}$$

where $[(u - U)', (v - V)'] = [(u - U)\cos\theta + (v - V)\sin\theta,\; -(u - U)\sin\theta + (v - V)\cos\theta]$
are shifted and rotated frequency coordinates. $H(u, v)$ is a Gaussian that is shifted
$(U, V)$ frequency units along the frequency axes $(u, v)$ and rotated by an angle $\theta$
relative to the positive $u$ axis. Thus, $H$ acts as a bandpass filter with center frequency
$(U, V)$ [relative to $(u, v)$] and a bandwidth controlled by $\sigma_x$ and $\sigma_y$.
Note that when the aspect ratio $\lambda$ of $g(x, y)$ differs from unity, the Gaussian is
asymmetric with an orientation $\theta$ that generally differs from the orientation $\phi$
of the complex sinusoid. A schematic of the frequency domain representation of a GEF is
shown in Fig. 3.2.

Fig. 3.1. Example of the real part of a GEF.

Fig. 3.2. Schematic of the frequency domain representation of a GEF.

When the Gaussian is circularly symmetric (i.e., $\sigma_x = \sigma_y = \sigma$), $(U, V)$
in (3.2) can be expressed in polar coordinates. Then,

$$h(x, y) = g(x, y) \exp[j\Omega x'] \tag{3.5}$$

where

$$\Omega = \sqrt{U^{2} + V^{2}} \tag{3.6}$$

and $x' = x\cos\phi + y\sin\phi$. The corresponding Fourier transform of $h(x, y)$ is

$$H(u, v) = \exp\left\{-\frac{\sigma^{2}}{2}\left[(u' - \Omega)^{2} + (v')^{2}\right]\right\}$$

where $(u', v') = (u\cos\phi + v\sin\phi,\; -u\sin\phi + v\cos\phi)$ are rotated frequency
coordinates. $H(u, v)$ is a Gaussian that is shifted radially $\Omega$ frequency units at an
angle $\phi$ relative to the positive $u$ axis. Thus, $H$ acts as a bandpass filter with
center frequency $(\Omega, 0)$ [relative to $(u', v')$] and a bandwidth controlled by $\sigma$.
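
As a sanity check on the transform pair, the Python sketch below evaluates a GEF on a grid and compares a Riemann-sum Fourier transform with the closed form. It is an illustration, not the author's code, and it rests on stated assumptions: radian frequencies with the convention $H(u, v) = \iint h(x, y)\, e^{-j(ux + vy)}\, dx\, dy$, a unit-area Gaussian envelope, and hypothetical parameter values and function names.

```python
import cmath
import math


def gef(x, y, U, V, sx, sy, theta):
    """GEF: rotated unit-area Gaussian times the complex sinusoid exp[j(Ux + Vy)]."""
    xr = x * math.cos(theta) + y * math.sin(theta)
    yr = -x * math.sin(theta) + y * math.cos(theta)
    gauss = math.exp(-0.5 * ((xr / sx) ** 2 + (yr / sy) ** 2)) / (2 * math.pi * sx * sy)
    return gauss * cmath.exp(1j * (U * x + V * y))


def gef_transform(u, v, U, V, sx, sy, theta):
    """Closed-form transform: a Gaussian centered at (U, V), rotated by theta."""
    du, dv = u - U, v - V
    ur = du * math.cos(theta) + dv * math.sin(theta)
    vr = -du * math.sin(theta) + dv * math.cos(theta)
    return math.exp(-0.5 * ((sx * ur) ** 2 + (sy * vr) ** 2))


def numeric_transform(u, v, params, half_width=15.0, step=0.5):
    """Riemann-sum approximation of the 2-D Fourier integral of the GEF."""
    n = int(round(half_width / step))
    total = 0j
    for i in range(-n, n + 1):
        for k in range(-n, n + 1):
            x, y = i * step, k * step
            total += gef(x, y, *params) * cmath.exp(-1j * (u * x + v * y))
    return total * step * step


# Hypothetical filter: (U, V, sigma_x, sigma_y, theta)
PARAMS = (1.0, 0.5, 2.0, 3.0, math.pi / 6)
checks = []
for u, v in [(1.0, 0.5), (1.3, 0.5), (1.0, 0.9)]:
    numeric = numeric_transform(u, v, PARAMS)
    analytic = gef_transform(u, v, *PARAMS)
    checks.append((u, v, numeric, analytic))
    print(f"(u, v) = ({u}, {v}): numeric {abs(numeric):.5f}, analytic {analytic:.5f}")
```

The response peaks at $(u, v) = (U, V)$ and falls off as a rotated Gaussian around that point, which is the bandpass behavior described above.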

The  analysis  in  Chapter  5  will  show  that  it  is  the  bandpass  nature  of  the  GEF  that 
is  most  essential  for  discriminating  textural  differences.  Hence,  since  the  aforementioned 
possibilities for filter functions - wavelet bases [62, 63], the difference of offset Gaussians
[56,  64],  and  Gaussian  derivatives  [64]  -  also  share  this  property,  the  choice  of  the  GEF 
is  not  restrictive.  Within  the  context  of  modeling  human  texture  perception,  Malik  and 
Perona  mentioned  that  the  exact  choice  of  a  filter  function  was  unimportant,  and  they 
chose  the  difference  of  offset  Gaussians  for  computational  simplicity  and  physiological 
plausibility  [56].  Also,  Bovik  et  al.  have  discussed  the  efficacy  of  bandpass  filters  for 
texture  segmentation  [53,  66]. 

The magnitude operation used in the Gabor filter (3.1) will now be discussed.
Julesz has shown that purely linear mechanisms are inadequate to explain how humans
perceive texture [67]. This point was further asserted by Malik and Perona [56] and can
be illustrated with Fig. 3.3. Fig. 3.3a shows a uniformly textured image that is easily
segmented by humans. If the homogeneous texture in Fig. 3.3b is added to Fig. 3.3a,
however, the resulting image (Fig. 3.3c) is difficult to segment. If purely linear
mechanisms were involved in texture segmentation, one would expect Figs. 3.3a and 3.3c to be
equally discriminable. Since they are obviously not, some form of nonlinearity must be
present. Therefore, to simulate human texture perception, a nonlinearity is desirable.
The magnitude operator introduces the desirable nonlinearity into the filter.

Fig. 3.3. Texture sequence demonstrating the need for a nonlinearity (from
Malik and Perona [56]):
(a) Uniform texture pair, easily segmented.
(b) Homogeneous texture.
(c) Adding (a) to (b) produces a uniform texture pair that is difficult for a
human to preattentively segment.

The  convolution  of  an  image  with  a  GEF  results  in  a  complex-valued  subimage. 
Bovik  et  al.  have  shown  that  the  amplitude  envelope  of  this  subimage  can  be  recovered 
by  computing  its  magnitude  and  that  the  resulting  amplitude  envelope  is  useful  for 
texture  segmentation  [53].  The  magnitude  operation  has  been  frequently  suggested  in 
the literature [53, 55, 60, 68, 69, 70].
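
The role of the magnitude can be seen in a small 1-D experiment. The Python sketch below is illustrative and makes several assumptions that are not in the text: unit sample spacing, a unit-area Gaussian envelope truncated at four standard deviations, and a hypothetical pure-cosine input. The helper name `gef_response` is invented for the example.

```python
import cmath
import math


def gef_response(signal, x, omega0, sigma):
    """Complex value of (signal * h)(x) for a 1-D GEF h centered at omega0.

    signal is a function of an integer position; h is a unit-area Gaussian
    of width sigma times exp(j * omega0 * x), truncated at +/- 4 sigma.
    """
    half = int(4 * sigma)
    norm = sigma * math.sqrt(2 * math.pi)
    total = 0j
    for k in range(-half, half + 1):
        h_k = math.exp(-0.5 * (k / sigma) ** 2) / norm * cmath.exp(1j * omega0 * k)
        total += signal(x - k) * h_k
    return total


OMEGA0 = 2 * math.pi / 10     # hypothetical carrier frequency, radians per sample
AMPLITUDE, PHASE = 3.0, 0.7   # hypothetical amplitude and phase


def carrier(x):
    return AMPLITUDE * math.cos(OMEGA0 * x + PHASE)


responses = [gef_response(carrier, x, OMEGA0, sigma=6.0) for x in range(40)]
magnitudes = [abs(z) for z in responses]   # nearly constant, about AMPLITUDE / 2
real_parts = [z.real for z in responses]   # still oscillates at the carrier frequency
print(f"magnitude range: {min(magnitudes):.4f} to {max(magnitudes):.4f}")
print(f"real-part range: {min(real_parts):.4f} to {max(real_parts):.4f}")
```

The complex output keeps oscillating at the carrier frequency, whereas its magnitude is flat and proportional to the input amplitude, which is why the magnitude is a convenient quantity on which to look for texture boundaries.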

Note that the magnitude operation is not without flaw. Aside from being
implausible neurophysiologically, Malik and Perona have shown that computing the magnitude
makes it impossible to discriminate certain texture pairs [56]. Appendix A analytically
verifies this assertion but then goes on to show that if mimicking human perception is not
essential, then a wide range of textures can be segmented without using a nonlinearity.
In spite of shortcomings, the magnitude computation provides a convenient analysis tool
and serves as a benchmark for comparing alternatives.


Chapter  4 


1-D  Analysis 


Using  a  mathematically  defined  1-D  texture  model,  this  chapter  analytically  shows 
that  applying  properly  configured  GEF-based  filters  to  textured  images  produces  output 
discontinuities  in  the  neighborhood  of  texture  boundaries  -  this  can  be  used  to  segment 
the  image.  Depending  on  the  nature  of  the  texture  difference,  this  output  discontinuity 
exhibits  certain  characteristic  signatures.  If  two  adjacent  textures  differ  in  local  spatial- 
frequency  content,  this  signature  exhibits  a  step  change  at  the  location  of  the  texture 
boundary.  If  the  two  adjacent  textures  differ  only  in  a  phase  shift,  the  signature  exhibits 
a  valley  at  the  location  of  the  texture  boundary.  The  following  sections  develop  this 
theory  in  detail.  First,  the  texture  model  is  defined.  Then,  analysis  shows  under  what 
conditions  the  various  types  of  signatures  occur. 

Although the 1-D model has significant limitations, it leads to a simple
analytical development providing useful insight. In Chapter 5, a more realistic 2-D model is
presented, leading to more informative results at the expense of more complex analyses.

4.1  The  1-D  Texture  Model 

Although researchers have not agreed on a precise definition for texture, several
descriptions have been proposed [12, 14, 71, 72]. Many textures can be described as
a collection of similar but not necessarily identical primitive objects arranged in some
repeating pattern. Based on this notion, texture will be modeled as a collection of simple
objects called texels. Groups of similar texels form regions of homogeneous texture.
A textured image consists of two or more regions, where texture differences between
regions are induced by varying the type and/or organization of the texels. Using Rao’s
terminology, this approach can represent a variety of texture types, including
strongly-ordered, weakly-ordered, and compositional textures [12]. This analysis does not
address disordered textures, which, due to their lack of structure, cannot be accurately
modeled as a collection of texels. Experimental results in Chapter 8 indicate, however, that
even disordered textures can be effectively discriminated/segmented with the filter-based
approach.

For convenience, textured images are divided into two levels of complexity:
uniform and nonuniform. For uniform textures, all texels within a region are identical in
shape and orientation and are spaced uniformly (e.g., Fig. 4.1a). For nonuniform
textures, the texels within a region may vary randomly in orientation, and the position and
shape of the texels may be perturbed (e.g., Fig. 4.1b).

Fig. 4.1. Examples of two levels of texture complexity: (a) uniform and (b)
nonuniform.

To develop a simple mathematical model based on this description, an image $i$ is
constructed consisting of two textured regions 1 and 2 with the texels uniformly spaced.
For simplicity, assume that the image is uniform in the $y$ direction (i.e.,
$\forall y,\; i(x, y) = i(x)$). This reduces the analysis to one dimension. In 1-D, a simple
texel can be modeled as

$$t(x) = f(x)\,\Pi_{\Delta x}(x)$$

where $\Pi$ is a gate function with width $\Delta x$

$$\Pi_{\Delta x}(x) = \begin{cases} 1, & |x| \le \frac{\Delta x}{2} \\ 0, & |x| > \frac{\Delta x}{2} \end{cases}$$

and $f$ is some real-valued function of $x$. Consider two simple texels represented as
amplitude-modulated gate functions

$$t_1(x) = \cos(\omega_1 x + \phi_1)\,\Pi_{\Delta x}(x) \tag{4.1}$$

$$t_2(x) = \cos(\omega_2 x + \phi_2)\,\Pi_{\Delta x}(x) \tag{4.2}$$

A 1-D textured region $i_k$, $k = 1, 2$, can be formed from a collection of $N$ equally
spaced texels $t_k$ as follows:¹

$$i_k(x) = t_k(x) * \sum_{l=1}^{N} \delta(x - l\,\Delta h)$$

¹ To avoid technical difficulties in the analysis, $\omega$ is allowed to take on positive or negative values.

Each texel within $i_k$ is a truncated sinusoid with frequency $\omega_k$ and phase angle
$\phi_k$. Thus, $\omega_k$ can be viewed as a local spatial frequency (i.e., local to a
texel), which occurs at regular intervals throughout the region. $i_k$ can therefore be
characterized by this local frequency, and so $\omega_k$ is referred to as the texture
frequency of textured region $i_k$.

A 1-D “image” $i$ consisting of two nonoverlapping textured regions can be
constructed as follows:

$$i(x) = i_1(x) + i_2(x - N\,\Delta h) \tag{4.3}$$

Assume for simplicity that the texel spacing is the same in both regions, and that each
region contains $N$ texels. Note that when $\omega_1 \ne \omega_2$, $i$ consists of two
regions that differ in local spatial-frequency content. Texels $t_1$ and $t_2$ also contain
phase components $\phi_1$ and $\phi_2$. When $\phi_1 \ne \phi_2$ the texels will differ in
phase. When $i$ consists of two regions that have identical local spatial-frequency content
but their texels differ in phase, a discontinuity in image phase occurs at the region
boundary. This condition will be called a texture-phase difference. Subsequent analysis will
show that a difference in local spatial-frequency content between textured regions causes a
step change in Gabor-filter output, whereas a texture-phase difference produces a valley in
the filter output.
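
The model translates directly into code. The following Python sketch is a minimal illustration of (4.1)-(4.3), not the author's implementation; the convolution with the impulse train is written as a sum of shifted texels, and all numerical values are hypothetical.

```python
import math

N_TEXELS = 3                   # texels per region (N)
SPACING = 4.0                  # texel spacing (Delta h)
WIDTH = 3.0                    # texel width (Delta x)
OMEGA1, PHI1 = math.pi, 0.0    # hypothetical texture frequency and phase, region 1
OMEGA2, PHI2 = 2.0, 0.0        # hypothetical texture frequency and phase, region 2


def gate(x, width):
    """Gate function: 1 for |x| <= width / 2, otherwise 0."""
    return 1.0 if abs(x) <= width / 2 else 0.0


def texel(x, omega, phi):
    """Amplitude-modulated gate, as in (4.1) and (4.2)."""
    return math.cos(omega * x + phi) * gate(x, WIDTH)


def image(x):
    """Two-region image (4.3): texels centered at l * SPACING for l = 1..2N."""
    total = 0.0
    for l in range(1, 2 * N_TEXELS + 1):
        if l <= N_TEXELS:
            total += texel(x - l * SPACING, OMEGA1, PHI1)   # region 1
        else:
            total += texel(x - l * SPACING, OMEGA2, PHI2)   # region 2
    return total


# The boundary lies midway between the last texel of region 1 and the
# first texel of region 2.
boundary = N_TEXELS * SPACING + SPACING / 2
print(image(4.0), image(6.0), image(16.5), boundary)
```

Shifting region 2 by $N\,\Delta h$ places its texels at $l\,\Delta h$ for $l = N + 1, \ldots, 2N$, which is why a single loop over $l = 1, \ldots, 2N$ suffices.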

4.2  Step  Signature 

The  step  signature  is  characterized  by  a  step  change  in  the  Gabor-filter  output  m 
at  the  boundary  between  two  textured  regions.  It  occurs  whenever  there  is  a  difference 
in  average  local  spatial-frequency  content  between  two  regions.  The  occurrence  of  a  step 
can  be  demonstrated  by  analyzing  the  Gabor  filter  response  to  an  image  consisting  of 
regions  with  different  texels. 

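Consider two simple texels with identical phase components (i.e., $\phi_1 = \phi_2 = 0$ in
(4.1) and (4.2)):

$$t_1(x) = \cos(\omega_1 x)\,\Pi_{\Delta x}(x) \tag{4.4}$$

$$t_2(x) = \cos(\omega_2 x)\,\Pi_{\Delta x}(x) \tag{4.5}$$

The Fourier transforms of $t_1$ and $t_2$ are given by

$$T_1(\omega) = \frac{\Delta x}{2}\left[\operatorname{sinc}\frac{(\omega - \omega_1)\Delta x}{2} + \operatorname{sinc}\frac{(\omega + \omega_1)\Delta x}{2}\right]$$

$$T_2(\omega) = \frac{\Delta x}{2}\left[\operatorname{sinc}\frac{(\omega - \omega_2)\Delta x}{2} + \operatorname{sinc}\frac{(\omega + \omega_2)\Delta x}{2}\right]$$

where $\operatorname{sinc}(x) = \sin(x)/x$. The Fourier transform of the image $i$ in (4.3) is

$$I(\omega) = I_1(\omega) + I_2(\omega) \tag{4.6}$$

where

$$I_1(\omega) = T_1(\omega) \sum_{l=1}^{N} e^{-j\omega l \Delta h}, \qquad I_2(\omega) = T_2(\omega) \sum_{l=N+1}^{2N} e^{-j\omega l \Delta h}$$

A Gabor filter (3.1) is now applied to $i$. Assume that the center frequency of a 1-D GEF
equals $\omega_1$, the texture frequency for region $i_1$. The Fourier transform for this
1-D GEF is

$$H_{\omega_1}(\omega) = \exp\left[-\frac{\sigma^{2}(\omega - \omega_1)^{2}}{2}\right] \tag{4.7}$$

Thus, the GEF is a bandpass filter with center frequency $\omega_1$ and bandwidth controlled
by $\sigma$. Assume also that $1/\sigma$ is approximately equal to the main lobe width of
$T_1$ and $T_2$. This ensures that the bandwidth of the filter includes most of the energy
of $T_1$ around $\omega_1$. Let us also assume that $\omega_1 \gg 0$ so that the sinc
functions making up $T_1$ (centered at $\omega_1$ and $-\omega_1$) do not significantly
overlap. Finally, assume that $\omega_1$ and $\omega_2$ are sufficiently separated that the
filter passes a negligible portion of $T_2$. Then, applying $H_{\omega_1}(\omega)$ to the
image (4.6) yields the filtered spectrum

$$\tilde{I}(\omega) = H_{\omega_1}(\omega)\, I(\omega) \approx H_{\omega_1}(\omega)\, \frac{\Delta x}{2} \operatorname{sinc}\frac{(\omega - \omega_1)\Delta x}{2} \sum_{l=1}^{N} e^{-j\omega l \Delta h}$$

Since $H_{\omega_1}$ is a function of $\omega - \omega_1$, we can define

$$F(\omega - \omega_1) = \frac{\Delta x}{2}\, H_{\omega_1}(\omega) \operatorname{sinc}\frac{(\omega - \omega_1)\Delta x}{2} \tag{4.8}$$

Hence, the spatial-domain form $\tilde{i}$ of the filtered image is given by

$$\tilde{i}(x) \approx \mathcal{F}^{-1}\left\{F(\omega - \omega_1) \sum_{l=1}^{N} e^{-j\omega l \Delta h}\right\} \tag{4.9}$$

Now,

$$F(\omega - \omega_1)\, e^{-j\omega x_0} \;\longleftrightarrow\; f(x - x_0)\, e^{j\omega_1 (x - x_0)} \tag{4.10}$$

is a Fourier transform pair. Therefore,

$$\tilde{i}(x) \approx \sum_{l=1}^{N} f(x - l\,\Delta h)\, e^{j\omega_1 (x - l \Delta h)} \tag{4.11}$$

where $f(x) = \mathcal{F}^{-1}\{F(\omega)\}$.

The complex exponential in (4.11) will cause oscillations in $\tilde{i}$ if it is not
eliminated. Now suppose $\omega_1 \Delta h = n 2\pi$ for some integer $n$ (the implications
of this assumption will be discussed shortly). Then, $\forall l$, the complex exponential
reduces to $e^{j\omega_1 x}$ and

$$\tilde{i}(x) \approx e^{j\omega_1 x} \sum_{l=1}^{N} f(x - l\,\Delta h)$$

Let us now examine $f$. The function $f$ is awkward to deal with analytically,
but we can get an intuitive feel for its shape. Referring to (4.8), we see that $F$ is the
product of a Gaussian and a sinc function. Since the inverse Fourier transform of a
Gaussian is a Gaussian and the inverse Fourier transform of a sinc is a gate function,
their multiplication in the frequency domain is equivalent to the convolution of a gate
with a Gaussian in the spatial domain. For $\sigma$ large relative to the texel spacing
$\Delta h$, $f$ is a greatly blurred gate function (i.e., a gate function with tapered
shoulders). Then the sum of offset $f$’s approaches a constant, say $C$ (actually a DC value
with some ripple), over the range $0 < x < N\,\Delta h$. Thus,

$$\tilde{i}(x) \approx \begin{cases} C\, e^{j\omega_1 x} & \text{if } 0 < x < N\,\Delta h \\ 0 & \text{otherwise} \end{cases}$$

To complete application of the Gabor filter, we compute the magnitude of $\tilde{i}$:

$$m(x) = |\tilde{i}(x)| \approx \begin{cases} |C| & \text{if } 0 < x < N\,\Delta h \\ 0 & \text{otherwise} \end{cases}$$

This implies that the output of the Gabor filter is approximately a constant value over
region 1, where the texture frequency matches the filter center frequency, and zero over
region 2. Thus, the Gabor filter output will be in the form of a step, with the transition
occurring in the vicinity of the texture boundary.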
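
The step can be reproduced with a short simulation. The Python sketch below is an illustration under assumptions that are not in the text: integer sample positions, a half-open gate of `WIDTH` samples, a unit-area Gaussian envelope truncated at four standard deviations, and hypothetical values for every parameter. The filter is tuned to the region-1 texture frequency, which is chosen so that $\omega_1 \Delta h$ is a multiple of $2\pi$.

```python
import cmath
import math

SPACING, WIDTH, N_TEXELS = 32, 24, 8     # texel spacing, texel width, texels per region
OMEGA1 = 2 * math.pi * 4 / SPACING       # region-1 texture frequency (OMEGA1 * SPACING = 8*pi)
OMEGA2 = 2 * math.pi * 10 / SPACING      # region-2 texture frequency (hypothetical)
SIGMA = 32.0                             # GEF width, comparable to the texel spacing


def image(x):
    """Two-region image (4.3) at integer x, using a half-open gate of WIDTH samples."""
    l = round(x / SPACING)               # index of the nearest texel center
    offset = x - l * SPACING
    if not 1 <= l <= 2 * N_TEXELS or not -WIDTH / 2 <= offset < WIDTH / 2:
        return 0.0
    omega = OMEGA1 if l <= N_TEXELS else OMEGA2
    return math.cos(omega * offset)


def gabor_output(signal, x, omega0, sigma):
    """m(x) = |(signal * h)(x)| for a 1-D GEF centered at omega0, truncated at 4 sigma."""
    half = int(4 * sigma)
    norm = sigma * math.sqrt(2 * math.pi)
    total = 0j
    for k in range(-half, half + 1):
        h_k = math.exp(-0.5 * (k / sigma) ** 2) / norm * cmath.exp(1j * omega0 * k)
        total += signal(x - k) * h_k
    return abs(total)


boundary = N_TEXELS * SPACING + SPACING // 2   # midway between texels N and N + 1
profile = {x: gabor_output(image, x, OMEGA1, SIGMA) for x in range(144, 401, 32)}
for x in sorted(profile):
    marker = "  <- boundary" if x == boundary else ""
    print(f"x = {x:3d}  m = {profile[x]:.4f}{marker}")
```

In this sketch the output is roughly constant well inside region 1 and drops to a small residual in region 2. The residual is not exactly zero because the sinc sidelobes of $T_2$ leak slightly into the passband, an effect the analysis neglects by assuming that the two texture frequencies are well separated.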

We now return to the issue of assuming that $\omega_1 \Delta h = n 2\pi$ for some integer
$n$. If we substitute $2\pi f_1$ for $\omega_1$, this equation reduces to
$f_1 = n/\Delta h$. Since $\Delta h$ is the texel spacing, this is equivalent to saying that
the filter’s center frequency is a multiple of the frequency of occurrence of the texels.
Thus, for a Gabor filter to produce a step signature with a small amount of ripple, $f_1$
should be a multiple of the reciprocal of the texel spacing.

The previous analyses show that $f_1$ is subject to two constraints: (1) $f_1$ should
equal one of the texture frequencies, and (2) $f_1$ should be a multiple of the reciprocal of
the  texel  spacing.  For  the  cosine  texture  used  in  this  discussion,  satisfying  both  of  these 
constraints  is  not  always  possible,  since  the  texel  spacing  is  not  necessarily  related  to 
either  texture  frequency.  For  more  complex  textures,  however,  the  local  spatial-frequency 
content  of  the  texels  is  typically  broadband.  In  that  case,  the  goal  is  to  select  a  local  2-D 
frequency  component  (both  radial  frequency  and  orientation)  that  differs  significantly  in 
energy  between  the  texels  of  different  regions.  This  choice  of  frequencies  makes  it  easier 
to  satisfy  both  constraints. 

One additional comment should be made regarding $\sigma$. We have assumed it to
be large to reduce output ripple. However, it must not be made too large or we will
lose accuracy in locating the region boundary. If $\sigma$ is too large, then the tail of
the last gate function will excessively extend into the adjacent region. This will cause
error in estimating the region boundary. Thus, the choice of $\sigma$ is a tradeoff between
the amount of output ripple and texture-boundary resolution.
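
The tradeoff can be made semi-quantitative under an idealization that is not part of the original analysis: treat region 1 as an infinite train of texels, so that the Poisson summation formula applies to the sum of offset $f$'s. The sketch also assumes the unit-peak Gaussian transform $H_{\omega_1}(\omega) = \exp[-\sigma^2(\omega - \omega_1)^2/2]$, so that $F(\omega) = \frac{\Delta x}{2}\, e^{-\sigma^2\omega^2/2} \operatorname{sinc}(\omega\,\Delta x/2)$ after re-centering (4.8) at $\omega = 0$.

```latex
% Poisson summation for the periodic sum of offset f's (infinite-train idealization)
\sum_{l=-\infty}^{\infty} f(x - l\,\Delta h)
  = \frac{1}{\Delta h} \sum_{k=-\infty}^{\infty}
    F\!\left(\frac{2\pi k}{\Delta h}\right) e^{\,j 2\pi k x / \Delta h}

% k = 0 term: the DC level C
C \approx \frac{F(0)}{\Delta h} = \frac{\Delta x}{2\,\Delta h}

% k = +1 and k = -1 terms combined: first-harmonic ripple relative to C
\frac{\text{ripple amplitude}}{C}
  \approx 2\, e^{-2\pi^{2}\sigma^{2}/\Delta h^{2}}
    \left|\operatorname{sinc}\!\left(\frac{\pi\,\Delta x}{\Delta h}\right)\right|
```

Under these assumptions the ripple falls off very quickly with $\sigma/\Delta h$: the exponential factor is about $7 \times 10^{-3}$ for $\sigma = \Delta h/2$ and about $3 \times 10^{-9}$ for $\sigma = \Delta h$. The blur of the boundary, by contrast, grows only in proportion to $\sigma$, which suggests that a $\sigma$ on the order of the texel spacing is already enough to suppress ripple.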

4.3  Valley  Signature 

The  magnitude  operation  in  the  Gabor  filter  (3.1)  discards  the  phase  of  the  GEF- 
filtered  image,  resulting  in  a  loss  of  information.  (Appendix  A  discusses  the  issue  of 
phase and alternatives to magnitude computation.) This section shows that certain
phase differences can be detected without using the phase component. This is in contrast
to  the  approach  of  Bovik  et  al.  [53],  where  phase  information  is  extracted  explicitly  by 
demodulation  of  the  channel  phase  component  (cf.  [53,  66,  68]).  The  approach  is  to 
design  a  suitable  Gabor  filter  that  detects  discontinuities  in  the  filter  output  m  caused 
by  abrupt  changes  in  texture  phase. 

Consider the textured image of Fig. 1.3b. The two uniform regions are identical
but offset vertically (the offset can also be horizontal). Thus the Fourier-transform
magnitudes of the two regions are identical, but their respective phase characteristics
differ. This type of texture difference will be referred to as a texture-phase difference.
(This phenomenon could equivalently be viewed as a collection of different texels near the
texture boundary, but analysis suggests that a difference-in-phase interpretation is more
appropriate.) Whereas a difference in local spatial-frequency content between textured
regions causes a step change in the filter output, a texture-phase difference produces a
valley in the output $m$.

To show how the valley signature can arise, again define two simple texels, $t_1$ and
$t_2$ as in (4.1) and (4.2), and let $\omega_1 = \omega_2$ and $\phi_1 = 0$. Then, $t_1$ and
$t_2$ are amplitude-modulated gate functions having the same frequency $\omega_1$ but
differing in phase:

$$t_1(x) = \cos(\omega_1 x)\,\Pi_{\Delta x}(x) \tag{4.12}$$

$$t_2(x) = \cos(\omega_1 x + \phi)\,\Pi_{\Delta x}(x) \tag{4.13}$$

where $\phi = \phi_2$. Thus, the Fourier transforms $T_1$ and $T_2$ of $t_1$ and $t_2$ are
each a pair of sinc functions centered at $\pm\omega_1$, and $T_2$ differs from $T_1$ only
by phase factors that depend on $\phi$.

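Let $i$ again be a 1-D image consisting of two nonoverlapping equally spaced collections
of $t_1$ and $t_2$, as in (4.3). The Gabor-filter output for $i$ will be derived next. The
Fourier transform of $i$ is

$$I(\omega) = T_1(\omega) \sum_{l=1}^{N} e^{-j\omega l \Delta h} + T_2(\omega) \sum_{l=N+1}^{2N} e^{-j\omega l \Delta h}$$

Again assume that the GEF is narrowband, centered at $\omega_1$, and that $\omega_1 \gg 0$.
Then applying the filter $H_{\omega_1}(\omega)$, as defined in (4.7), to the image
approximately retains only terms containing $\omega - \omega_1$. That is, after filtering

$$\tilde{I}(\omega) \approx F(\omega - \omega_1) \left[\sum_{l=1}^{N} e^{-j\omega l \Delta h} + e^{-j\phi} \sum_{l=N+1}^{2N} e^{-j\omega l \Delta h}\right] \tag{4.14}$$

where $F$ is given by (4.8). From (4.10), the filtered image $\tilde{i}$ is

$$\tilde{i}(x) \approx \sum_{l=1}^{N} f(x - l\,\Delta h)\, e^{j\omega_1 (x - l\Delta h)} + e^{-j\phi} \sum_{l=N+1}^{2N} f(x - l\,\Delta h)\, e^{j\omega_1 (x - l\Delta h)}$$

Again assume that $f_1 = \omega_1/(2\pi) = n/\Delta h$ for some integer $n$; i.e., the GEF’s
center frequency is a multiple of the reciprocal of the texel spacing. Then, $\forall l$,
the complex exponentials $e^{j\omega_1 (x - l\Delta h)}$ reduce to $e^{j\omega_1 x}$. Thus,

$$\tilde{i}(x) \approx e^{j\omega_1 x} \left[\sum_{l=1}^{N} f(x - l\,\Delta h) + e^{-j\phi} \sum_{l=N+1}^{2N} f(x - l\,\Delta h)\right] \tag{4.15}$$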


By the definition of the image $i$ in (4.3), we know that the texture discontinuity occurs
in the vicinity of $x = x_d = N\Delta h + \Delta h/2$. Let us compare the value of $i(x_d)$ to values of
$i(x)$ at points far removed from $x_d$. First, recall from before that $F(u - \omega_1)$ is a Gaussian
multiplied by a sinc, so its inverse transform $f$ resembles a gate function with tapered
shoulders. Now, at $x_d$, the sum of $f$'s from the left sum becomes


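$$f\!\left(\frac{\Delta h}{2}\right) + f\!\left(\frac{3\Delta h}{2}\right) + f\!\left(\frac{5\Delta h}{2}\right) + \cdots \tag{4.16}$$

and the corresponding terms from the right sum are

$$f\!\left(\frac{-\Delta h}{2}\right) + f\!\left(\frac{-3\Delta h}{2}\right) + f\!\left(\frac{-5\Delta h}{2}\right) + \cdots$$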


Observe that $f$ is symmetric about $\Delta h/2$ and that the dominant contributions to the
sums are $f(\Delta h/2)$ and $f(-\Delta h/2)$, which are equal. Thus,

$$i(x_d) \approx \left(f(\Delta h/2) + \epsilon\right)\left(e^{j\omega_1 x_d} + e^{j(\omega_1 x_d - \phi)}\right)$$

where $\epsilon$ represents the sum of the less significant terms. Computing the complex
magnitude completes the application of the Gabor filter. Dropping terms involving $\epsilon$ gives


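$$m(x_d) = G_f(i(x_d)) = |i(x_d)| \approx |f(\Delta h/2)| \cdot |S|$$

where

$$S = \sqrt{\left(\cos(\omega_1 x_d) + \cos(\omega_1 x_d - \phi)\right)^2 + \left(\sin(\omega_1 x_d) + \sin(\omega_1 x_d - \phi)\right)^2}$$

Note that $|S| < 2$, $\forall \phi$ such that $|\phi| > 0$. Hence,

$$m(x_d) = G_f(i(x_d)) < 2\,|f(\Delta h/2)|$$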
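The bound on $|S|$ can be checked in one step. The following worked simplification uses only the definition of $S$ above together with the identity $(\cos a + \cos b)^2 + (\sin a + \sin b)^2 = 2 + 2\cos(a - b)$, with $a = \omega_1 x_d$ and $b = \omega_1 x_d - \phi$:

$$S^2 = 2 + 2\cos\phi \quad\Longrightarrow\quad |S| = 2\,\left|\cos(\phi/2)\right|$$

Under this form, $|S| = 2$ only when $\phi$ is a multiple of $2\pi$ (no phase difference), and $|S| = 0$ when $\phi = \pi$. The position $x_d$ drops out, so the depth of the dip depends only on the phase difference.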


Consider now a position $x < x_d$. Contributions to $i$ will then be predominantly
from the left sum in (4.15), which is expanded in (4.16). In this case, we have

$$i(x) \approx \left(2f(\Delta h/2) + \epsilon\right)e^{j\omega_1 x}$$

After computing the magnitude and dropping $\epsilon$ terms,

$$m(x) = G_f(i(x)) = |i(x)| \approx |2f(\Delta h/2)| \tag{4.17}$$

Similarly, for $x > x_d$,

$$i(x) \approx \left(2f(\Delta h/2) + \epsilon\right)e^{j(\omega_1 x - \phi)}$$

and $m(x)$ is the same as in (4.17). Thus, we see that, for a filter tuned to the texture
frequency, the value of the filter output $m$ at the discontinuity is less than it is at other
points. Changes in texture phase can thus be detected by locating valleys in the filter
output $m$. In certain cases the discontinuity in filter output is a ridge rather than a
valley. The 1-D texture model, however, is insufficient to derive the ridge signature.
In the next chapter, a 2-D texture model is presented that allows for a more detailed
analysis of the step and valley signatures and explains the origin of the ridge signature.
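The valley can also be observed numerically. The following is a minimal sketch, not the thesis's implementation: it assumes a discrete 1-D signal made of two cosines with the same frequency and a phase offset `phi` at the midpoint, and a Gabor filter built as a unit-gain Gaussian envelope times a complex exponential. The function name `gabor_magnitude_1d` and all parameter values are illustrative choices.

```python
import numpy as np

def gabor_magnitude_1d(signal, omega, sigma):
    """Magnitude of the signal after filtering with a 1-D Gabor elementary function."""
    half = int(4 * sigma)
    k = np.arange(-half, half + 1)
    envelope = np.exp(-k**2 / (2.0 * sigma**2))
    envelope /= envelope.sum()                 # unit gain at the center frequency
    gef = envelope * np.exp(1j * omega * k)    # Gaussian-modulated complex exponential
    return np.abs(np.convolve(signal, gef, mode="same"))

# Two regions with the same frequency but a phase offset phi at the boundary.
n, period, phi = 1024, 16, np.pi / 2
omega = 2 * np.pi / period
x = np.arange(n)
boundary = n // 2
signal = np.where(x < boundary, np.cos(omega * x), np.cos(omega * x + phi))

m = gabor_magnitude_1d(signal, omega, sigma=2 * period)
interior = m[n // 4]          # well inside the left region
at_boundary = m[boundary]     # at the phase discontinuity
print(interior, at_boundary, at_boundary / interior)
```

In this setup the ratio of boundary to interior output comes out close to $|\cos(\phi/2)|$, which is the dip predicted by the analysis; small deviations arise from the finite filter support and from the negative-frequency component of the cosine.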


Chapter  5 


2-D  Analysis 

The previous 1-D analysis showed that the application of a properly tuned Gabor
filter to a textured image produces either a step or valley signature at texture boundaries.
Using a 2-D texture model, similar to one proposed by Clark and Bovik [68], this chapter
provides a more detailed development of the step and valley signatures and explains how
the ridge signature can occur at a texture-phase discontinuity. The analysis also demonstrates
the existence of certain signature anomalies called overshoot and undershoot and
shows how they originate. The more general 2-D model also allows for evaluation of
asymmetric filters (i.e., $\sigma_x \neq \sigma_y$ in (3.3)) and provides concrete guidelines for selecting
Gabor-filter parameters. These issues are discussed in detail in Section 6.1. Although
quantitative analysis is still limited to uniform textures, the 2-D model provides for a
qualitative understanding of nonuniform textures. Section 5.2.3 discusses nonuniform
textures and describes a fourth signature type resulting from the texel variation inherent
in these textures.

5.1  2-D  Texture  Model 

Section 4.1 developed a 1-D texture model based on collections of texels. This
section presents a more robust 2-D version of that model. As in Section 4.1, an image
$i$ is formed from two uniform textures $\tilde{\imath}_1$ and $\tilde{\imath}_2$. For the time being, assume that the
two textures $\tilde{\imath}_1$ and $\tilde{\imath}_2$ consist of texels $t_1$ and $t_2$ that differ. Later, as necessary, these
conditions will be varied.

Define a texel $t_1(x,y)$ as any real deterministic function that has a Fourier transform
$T_1(u,v)$ that exists (singularity functions, such as impulses, are allowed to appear
in $T_1(u,v)$). A uniform texture $\tilde{\imath}_1$ made up of an array of texels can be represented
by

$$\tilde{\imath}_1(x,y) = t_1(x,y) * \sum_{k,l} \delta(x - k\Delta x,\; y - l\Delta y)$$

where $\Delta x$ is the texel period in $x$, $\Delta y$ is the texel period in $y$, and the Fourier transform
of $\tilde{\imath}_1$ is

$$\tilde{I}_1(u,v) = \frac{4\pi^2}{\Delta x\,\Delta y}\,T_1(u,v)\sum_{k,l}\delta\!\left(u - \frac{2\pi k}{\Delta x},\; v - \frac{2\pi l}{\Delta y}\right) \tag{5.1}$$


$\tilde{I}_1$ consists of a collection of weighted impulses whose signal energy is concentrated at
the discrete set of frequencies $(2\pi k/\Delta x,\, 2\pi l/\Delta y)$. These frequencies will be referred to
as the harmonics of $\tilde{I}_1$. A uniform textured region with limited spatial extent, $i_1$, can be
formed from $\tilde{\imath}_1$ by

$$i_1(x,y) = \Pi_{r,s}(x,y)\,\tilde{\imath}_1(x,y) \tag{5.2}$$

where

$$\Pi_{r,s}(x,y) = \begin{cases} 1, & |x| < r/2,\ |y| < s/2 \\ 0, & \text{otherwise} \end{cases}$$

is the 2-D gate function. Region $i_1$ has support $r \times s$ and is centered at $(0,0)$. The
Fourier transform $I_1$ of $i_1$ is


i\[u,v)  =  .7^[ii(x,y)l  = 


1 


[5(1/.  r)  +  /i(u,v)] 


(5.3)

where

$$S(u,v) = \mathcal{F}[\Pi_{r,s}(x,y)] = rs\,\mathrm{sinc}(ur/2)\,\mathrm{sinc}(vs/2) \tag{5.4}$$

and $\mathrm{sinc}(x) = \sin(x)/x$. Consider a second uniform texture $\tilde{\imath}_2$ made up of texels $t_2$, where
$t_2(x,y)$ is again any real deterministic function that has a Fourier transform $T_2(u,v)$ that
exists (singularity functions again will be allowed in $T_2(u,v)$). Then,

$$\tilde{\imath}_2(x,y) = t_2(x,y) * \sum_{k,l} \delta(x - k\Delta x,\; y - l\Delta y) \tag{5.5}$$

A uniform textured region $i_2$ of support $r \times s$ and centered at $(r,0)$ is given by

$$i_2(x,y) = \Pi_{r,s}(x - r, y)\,\tilde{\imath}_2(x,y) \tag{5.6}$$

Then, 

/2(«,  n)  =  ^[t2(x,  y)]  =  ^[5'(«,  v)e"-'“’’]  *  hiu,  v)  (5.7) 

where $\tilde{I}_2$ is similar to $\tilde{I}_1$ in (5.1), except that $T_1$ is replaced by $T_2$. The regions $i_1$ and $i_2$
can be combined to form a finite-extent textured image

$$i(x,y) = i_1(x,y) + i_2(x,y) \tag{5.8}$$

Thus, $i$ consists of two adjacent nonoverlapping textured regions $i_1$ and $i_2$. See Fig. 5.1.
The image $i$ is spatially limited as a rectangular function to make analysis tractable
(e.g., well-defined sinc functions, such as (5.4), occur frequently during the subsequent
analysis). Also, a spatially limited $i$ conforms to a real-world image setting. Clark and
Bovik employed a similar model, but their analysis used general indicator functions [68].
The analysis presented here leads to somewhat more tractable results and also more
easily leads to an understanding of specific filter-output behavior.

Fig. 5.1. Bipartite textured image model. Image $i(x,y)$ has support $2r \times s$ and
is centered about $(r/2, 0)$. Texture $i_1(x,y)$ is made up of texels $t_1(x,y)$, and $i_2(x,y)$
is made up of texels $t_2(x,y)$.

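Now,

$$\mathcal{F}[i(x,y)] = I(u,v) = I_1(u,v) + I_2(u,v) \tag{5.9}$$

where (evaluating (5.3) and (5.7)),

$$I_1(u,v) = \frac{2\pi}{\Delta x\,\Delta y}\sum_{k,l} S_{k,l}\,T_1\!\left(\frac{2\pi k}{\Delta x}, \frac{2\pi l}{\Delta y}\right) \tag{5.10}$$

$$I_2(u,v) = \frac{2\pi}{\Delta x\,\Delta y}\sum_{k,l} S_{k,l}\,T_2\!\left(\frac{2\pi k}{\Delta x}, \frac{2\pi l}{\Delta y}\right)e^{-jr(u - 2\pi k/\Delta x)} \tag{5.11}$$

$$S_{k,l} = S(u - 2\pi k/\Delta x,\; v - 2\pi l/\Delta y) \tag{5.12}$$

and $S(u,v)$ is given by (5.4). Or

$$I(u,v) = \frac{2\pi}{\Delta x\,\Delta y}\sum_{k,l} S_{k,l}\left[T_1\!\left(\frac{2\pi k}{\Delta x}, \frac{2\pi l}{\Delta y}\right) + T_2\!\left(\frac{2\pi k}{\Delta x}, \frac{2\pi l}{\Delta y}\right)e^{-jr(u - 2\pi k/\Delta x)}\right] \tag{5.13}$$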


Observe that $I_1$ consists of a collection of scaled 2-D sinc functions centered at the
harmonics $(2\pi k/\Delta x,\, 2\pi l/\Delta y)$. The amplitude of the sinc $S_{k,l}$ at harmonic $(2\pi k/\Delta x,\, 2\pi l/\Delta y)$
is proportional to the value of the Fourier transform of the texel $T_1$ evaluated at that
harmonic. $I_2$ also consists of a collection of scaled 2-D sinc functions centered at the
harmonics. The amplitudes of the sincs for $I_2$, however, are proportional to $T_2$ rather
than $T_1$, and their phase components are influenced by a complex phase factor. Thus,
by (5.13), $I$ is a sum of scaled sincs $S_{k,l}$. Each sinc consists of a component from each
texture region. Or, more colloquially, each $(k,l)$ component of $I$ consists of a pair of
sincs, one for each texture.
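The harmonic structure described above can be illustrated with a discrete analogue. The sketch below is illustrative only: it assumes a hypothetical rectangular texel, tiles it into a uniform texture that spans a whole number of periods, and checks with a 2-D FFT that the spectrum is nonzero only at multiples of the fundamental, with amplitudes set by the texel's own transform. The finite-window sinc spreading of the continuous model does not appear here because the DFT window holds an exact number of periods.

```python
import numpy as np

# Hypothetical texel: a small bright rectangle on a dark cell.
dx, dy = 8, 16                      # texel periods in pixels (Delta x, Delta y)
texel = np.zeros((dy, dx))          # rows index y, columns index x
texel[4:9, 2:5] = 1.0

# Uniform texture: the texel repeated on a regular grid.
reps_y, reps_x = 8, 16
texture = np.tile(texel, (reps_y, reps_x))     # shape (128, 128)
ny, nx = texture.shape

spectrum = np.abs(np.fft.fft2(texture))

# Energy may appear only at DFT bins that are multiples of ny/dy (rows)
# and nx/dx (columns): these bins are the harmonics of the texture.
rows, cols = np.nonzero(spectrum > 1e-9 * spectrum.max())
on_harmonics = bool(np.all(rows % (ny // dy) == 0) and np.all(cols % (nx // dx) == 0))

# The amplitude at each harmonic is the texel transform, scaled by the texel count.
harmonic_amplitudes = spectrum[:: ny // dy, :: nx // dx]
texel_amplitudes = reps_y * reps_x * np.abs(np.fft.fft2(texel))
print(on_harmonics, np.allclose(harmonic_amplitudes, texel_amplitudes))
```

Swapping in a different texel changes the amplitudes at the harmonics but not their locations, which is the property the step-signature analysis relies on.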

Thus, the texture segmentation problem is to find the boundary separating regions
$i_1$ and $i_2$ in image $i$. Per the model's construction, the boundary separating these two
textures is the line segment given by $x = r/2$ and $|y| < s/2$. The goal is to understand
how the Gabor filter (3.1) will help in locating this boundary.

5.2  Characterizing  Gabor-Filter  Outputs 

This section shows analytically that the application of Gabor filters to textured
images produces outputs that exhibit discontinuities in the neighborhood of texture
boundaries. This is shown within the context of the texture model defined in Section 5.1.
The analysis begins with those texture configurations that produce a step
signature and is followed by an analysis of texture types that produce a valley or ridge
signature.

5.2.1  Textures Consisting of Different Texels: Step Signature

This section derives conditions under which the application of Gabor filter (3.1) to a
uniformly textured image produces a step signature. The step signature is characterized
by a step change in the Gabor-filter output $m$ at the boundary between two textured
regions. This signature type occurs when a properly tuned Gabor filter is applied to a
uniformly textured image that contains two textures whose constituent texels $t_1$ and $t_2$
differ.


To derive this result, consider the outcome of applying a Gabor filter (3.1) to the
textured-image model $I$ in (5.9) (or equivalently $i$ in (5.8)). The goal is to design a filter
that enables "easy" localization of the texture boundary. Analytically, the approach
is to design a Gabor filter that passes the image energy centered about one harmonic
$(k,l)$. This is equivalent to passing one and only one scaled sinc in (5.13), where
the sinc draws contributions from each texture; i.e., design a filter that passes one sinc
pair occurring at some harmonic $(k,l)$. Each sinc in the pair represents a gate function
in the spatial domain. Each gate coincides with one of the two region boundaries, and
the difference in gate amplitude is proportional to the amplitude difference between the
two sincs (i.e., $|T_1 - T_2|$). By isolating a sinc pair whose sincs differ significantly in
amplitude, a filter output is produced that is approximately constant within a region,
but differs between regions, thus forming a step signature.

Designing a Gabor filter involves specifying the five parameters $(U, V, \sigma_x, \sigma_y, \theta)$
of the GEF $H$ in (3.4). To pass the single sinc pair at harmonic indices $(k,l)$, the
center frequency $(U,V)$ of $H$ is specified as $U = 2\pi k/\Delta x$, $V = 2\pi l/\Delta y$. The bandwidth
of $H$, determined by $(\sigma_x, \sigma_y)$, is then selected so that $H$ passes most of the image
energy centered about harmonic $(k,l)$, while also largely rejecting the image energy at
adjacent harmonics. Since harmonic spacing is inversely proportional to texel spacing $(\Delta x, \Delta y)$,
the ratios $(\sigma_x/\Delta x,\, \sigma_y/\Delta y)$ determine this filter characteristic. Clearly, the choice of
$(\sigma_x/\Delta x,\, \sigma_y/\Delta y)$ is a tradeoff between attenuation of the desired harmonic and rejection
of adjacent harmonics. The consequences of this tradeoff are discussed in Section 6.1.
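To make the role of the ratio $\sigma/\Delta$ concrete, the sketch below computes how strongly a filter centered on one harmonic passes the neighboring harmonic. It rests on an assumption not confirmed by this section: that the GEF passband is Gaussian in angular frequency, $H(u) = \exp(-\tfrac{1}{2}\sigma^2 (u - U)^2)$, with the spatial $\sigma$ in the same units as the texel period. The function name `adjacent_harmonic_gain` is hypothetical.

```python
import numpy as np

def adjacent_harmonic_gain(sigma_over_period):
    """Gain of an assumed Gaussian passband at the neighboring harmonic.

    Assumes H(u) = exp(-0.5 * sigma**2 * (u - U)**2), centered on a harmonic U,
    with harmonics spaced 2*pi/period apart in angular frequency.
    """
    ratio = np.asarray(sigma_over_period, dtype=float)
    return np.exp(-0.5 * (2.0 * np.pi * ratio) ** 2)

ratios = np.array([0.25, 0.5, 1.0])
gains = adjacent_harmonic_gain(ratios)
for r, g in zip(ratios, gains):
    print(f"sigma/period = {r:4.2f} -> adjacent-harmonic gain = {g:.3e}")
```

Under this assumed passband, increasing $\sigma/\Delta$ rejects the adjacent harmonic more strongly, but it also widens the spatial envelope of the filter, so the transition at the texture boundary becomes more gradual.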

Applying $H$ to $I$ gives

$$I_f(u,v) = H(u,v)\,I(u,v)$$

Since $H$ has been designed to pass only those frequency components in the neighborhood
of $(U,V)$, we can write

$$I_f(u,v) \approx \frac{2\pi}{\Delta x\,\Delta y}\,H(u,v)\,S(u - U, v - V)\left\{T_1 + T_2\,e^{-jr(u - U)}\right\} \tag{5.14}$$

where $T_1$ and $T_2$ are abbreviations for $T_1(U,V)$ and $T_2(U,V)$. Observing that $H$ in (3.4)
is a function of $u - U$ and $v - V$, the function $S_f$ is defined as

$$S_f(u - U, v - V) = H(u,v)\,S(u - U, v - V) \tag{5.15}$$


where 


T-^[Sfiu  -U,v-V)]  = 


and 


Sf{x,y)  =  ^ 


(5.16) 


By substituting $S_f$ into (5.14), the inverse Fourier transform of $I_f$ can be expressed as


y)  =  [r,s/(x,  y)  +  T2Sf{x  -  r,  y)]  (5.17) 

■'  Ax  Ay 


Computing the magnitude of $i_f$ completes the application of the Gabor filter and gives

$$m(x,y) = |i_f(x,y)| = \frac{2\pi}{\Delta x\,\Delta y}\sqrt{A + B + C} \tag{5.18}$$

where

$$A = |T_1|^2\, s_f^2(x,y)$$

$$B = |T_2|^2\, s_f^2(x - r, y)$$

$$C = (T_1^* T_2 + T_1 T_2^*)\, s_f(x,y)\, s_f(x - r, y)$$

(It can be seen from (3.4), (5.4), (5.15), and (5.16) that $s_f$ is real.)

To understand the behavior of $m$, we first need to determine $s_f$. $S_f$ equals a
sinc multiplied by a Gaussian. Thus, in the spatial domain, $s_f$ can be expressed as the
convolution of a Gaussian with a gate function:

$$s_f(x,y) = \int_{-s/2}^{s/2}\int_{-r/2}^{r/2} g(x - \alpha, y - \beta)\,d\alpha\,d\beta \tag{5.19}$$

$$s_f(x - r, y) = \int_{-s/2}^{s/2}\int_{r/2}^{3r/2} g(x - \alpha, y - \beta)\,d\alpha\,d\beta \tag{5.20}$$

where $g$ is the Gaussian (3.3). The quantity $m$ can now be evaluated by examining
its behavior at the texture boundary and at points far removed from the boundary (or
equivalently points within the interiors of each texture). Assume that the region width
$r$ in the $x$ direction is large relative to $\sigma_x$, and the region height $s$ in the $y$ direction is
large relative to $\sigma_y$. Then, for points away from the textured image's outer boundary
and left of the texture boundary (i.e., $|y| < s/2$ and $|x| < r/2$ for points in region 1),
$s_f(x,y) \approx 1$ and $s_f(x - r, y) \approx 0$ per (5.19) and (5.20). Then $m \approx \frac{2\pi}{\Delta x\,\Delta y}|T_1|$. Similarly,
for points to the right of the texture boundary in region 2, $m \approx \frac{2\pi}{\Delta x\,\Delta y}|T_2|$.




Now, at the texture boundary ($x = r/2$), the filter output $m$ is

$$m(r/2, y) \approx \frac{2\pi}{\Delta x\,\Delta y}\sqrt{|T_1|^2/4 + |T_2|^2/4 + (T_1^* T_2 + T_1 T_2^*)/4} = \frac{\pi}{\Delta x\,\Delta y}\,|T_1 + T_2| \tag{5.21}$$

since $s_f(x,y)$ and $s_f(x - r, y)$ both $\approx 1/2$ at $x = r/2$. Now suppose that $T_1$ and $T_2$ are
both real and positive. Then $m(r/2, y)$ becomes the average of values far to the right
and left of the texture boundary, and (5.18) can be rewritten as

$$m(x,y) = \frac{2\pi}{\Delta x\,\Delta y}\left(T_1^2 s_f^2(x,y) + T_2^2 s_f^2(x - r, y) + 2T_1 T_2\, s_f(x,y)\, s_f(x - r, y)\right)^{1/2} = \frac{2\pi}{\Delta x\,\Delta y}\left(T_1 s_f(x,y) + T_2 s_f(x - r, y)\right)$$

Observing that $s_f(x - r, y) \approx 1 - s_f(x,y)$, we see that $m$ is a linear function of $s_f$. Since
$s_f$ is the integral of a Gaussian, its shape is similar to a sigmoid function. Thus $m$ is
also shaped like a sigmoid in the neighborhood of the texture boundary. Assuming that
$|T_1| \neq |T_2|$, $m$ is given by a constant value

$$A_1 = \frac{2\pi}{\Delta x\,\Delta y}\,|T_1| \tag{5.22}$$

over region 1 and a constant value

$$A_2 = \frac{2\pi}{\Delta x\,\Delta y}\,|T_2| \tag{5.23}$$

over region 2 with a sigmoid transition between regions; i.e.,

$$m(x,\cdot) \approx \begin{cases} A_1, & x < r/2 \\ \text{sigmoidal transition from } A_1 \text{ to } A_2, & x \text{ near } r/2 \text{ (texture boundary)} \\ A_2, & x > r/2 \end{cases}$$

Thus $m$ resembles a step function with the transition occurring near the texture boundary.
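A 1-D cross-section through the boundary shows the step directly. The sketch below is a simplified stand-in for the 2-D case: it assumes two hypothetical binary texels of different widths on the same lattice (so that both texel transforms are real and positive at the first harmonic), and a Gabor filter formed from a unit-gain Gaussian envelope and a complex exponential. All names and parameter values are illustrative.

```python
import numpy as np

def gabor_magnitude_1d(signal, omega, sigma):
    """Magnitude of the signal after filtering with a 1-D Gabor elementary function."""
    half = int(4 * sigma)
    k = np.arange(-half, half + 1)
    envelope = np.exp(-k**2 / (2.0 * sigma**2))
    envelope /= envelope.sum()                 # unit gain at the center frequency
    return np.abs(np.convolve(signal, envelope * np.exp(1j * omega * k), mode="same"))

period, n = 16, 1024
omega = 2 * np.pi / period                     # tune to the first harmonic
x = np.arange(n)
dist = np.minimum(x % period, period - x % period)   # distance to nearest texel center
narrow = (dist <= 1).astype(float)             # texel of width 3 (hypothetical)
wide = (dist <= 3).astype(float)               # texel of width 7 (hypothetical)
boundary = n // 2
signal = np.where(x < boundary, narrow, wide)

m = gabor_magnitude_1d(signal, omega, sigma=2 * period)
a1, a2, mid = m[n // 4], m[3 * n // 4], m[boundary]
print(a1, a2, mid)
```

In this example the two interior levels equal the texel-transform magnitudes at the tuned harmonic divided by the period, and the value at the boundary falls between them, which is the step behavior described above.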

Suppose now that $T_1$ and $T_2$ are negative or complex. Then $m$ can take on
values $< \min(A_1, A_2)$ or $> \max(A_1, A_2)$ near the texture boundary. These possibilities
are referred to as undershoot and overshoot, respectively. To see how undershoot can
occur, (5.21) shows that near the texture boundary $m$ is proportional to $|T_1 + T_2|$.
Thus, if $T_1$ and $T_2$ are negative or complex, the magnitude of their sum can be less
than the magnitude of either component. Overshoot can occur if the Gabor-filter center
frequency $(U,V)$ is not equal to one of the harmonics of $I$. The phenomena of undershoot
and overshoot need not overly complicate the detection of the texture boundary. They
are illustrated in Chapter 8 and discussed analytically in Appendix B.
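The undershoot condition can be checked with a few lines of arithmetic. The sketch below uses only the proportionality in (5.21), with the common prefactor dropped so that the interior levels are $|T_1|$ and $|T_2|$ and the boundary level is $|T_1 + T_2|/2$. The function names are hypothetical, and the sample values of $T_1$ and $T_2$ are made up for illustration.

```python
def boundary_level(t1, t2):
    """Relative filter output at the texture boundary, per (5.21): |T1 + T2| / 2."""
    return abs(t1 + t2) / 2.0

def has_undershoot(t1, t2):
    """True when the boundary value drops below both interior levels |T1| and |T2|."""
    return boundary_level(t1, t2) < min(abs(t1), abs(t2))

# Same sign: the boundary value is the average of the two interior levels.
print(boundary_level(1.0, 3.0), has_undershoot(1.0, 3.0))
# Opposite signs: partial cancellation pulls the boundary below both levels.
print(boundary_level(1.0, -2.0), has_undershoot(1.0, -2.0))
# A 90-degree phase difference between T1 and T2 also produces undershoot here.
print(boundary_level(1.0, 1.0j), has_undershoot(1.0, 1.0j))
```

What matters in this simplified view is the relative phase between $T_1$ and $T_2$: when the two have the same phase, the boundary value stays between the interior levels.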


5.2.2  Textures  Consisting  of  Identical  Texels,  but  Exhibiting  a  Texture- 
Phase  Difference:  Valley  and  Ridge  Signatures 

This section shows that the application of a Gabor filter to a textured image
exhibiting a texture-phase difference (as defined in Section 4.3) produces a valley in the
Gabor-filter output $m$ when the GEF is properly tuned; if an improperly tuned GEF is
used, a ridge occurs in $m$ at the texture boundary.

5.2.2.1  Valley Signature

Again, the goal is to design a filter that enables easy localization of the texture
boundary. Analytically, the procedure is to design a Gabor filter that passes the image
energy centered about one harmonic $(2\pi k/\Delta x,\, 2\pi l/\Delta y)$. This is equivalent to passing
one and only one sinc pair centered about some harmonic $(2\pi k/\Delta x,\, 2\pi l/\Delta y)$. In this
case, the amplitudes of the sincs are identical. The offset regions, however, produce a
phase shift $\psi$ (given in (5.26)) between the sincs, resulting in a drop in filter output given
by (5.29) near the texture boundary.

First, the texture model of Section 5.1 is modified to fit the texture-phase-difference
scenario. Define a texel $t_1$ as before and construct a uniform textured region
$i_1$ as in (5.2). Define a second texel $t_2$ equal to $t_1$ but shifted $\delta x$ in the $x$ direction and
$\delta y$ in the $y$ direction, where $0 < \delta x < \Delta x$ and $0 < \delta y < \Delta y$. Then

$$t_2(x,y) = t_1(x - \delta x,\; y - \delta y)$$

A uniform texture $\tilde{\imath}_2$ whose texels are periodic in $x$ and $y$ can be constructed from this
texel as shown in (5.5), and a uniform textured region $i_2$ of support $r \times s$ and centered at
$(r,0)$ can be formed from $\tilde{\imath}_2$ as in (5.6). Thus a uniform textured image $i$ that exhibits
a texture-phase difference at $x = r/2$ can be formed similar to (5.8):

$$i(x,y) = i_1(x,y) + i_2(x,y)$$

$\mathcal{F}[i(x,y)]$ is then similar to (5.9):

$$I(u,v) = I_1(u,v) + I_2(u,v)$$

$I_1(u,v)$ is given by (5.10), but $I_2(u,v)$ differs from (5.11), since

$$T_2(u,v) = T_1(u,v)\,e^{-j(u\,\delta x + v\,\delta y)}$$

Thus

$$I(u,v) = \frac{2\pi}{\Delta x\,\Delta y}\sum_{k,l} S_{k,l}\,T_1\!\left(\frac{2\pi k}{\Delta x}, \frac{2\pi l}{\Delta y}\right)\left[1 + e^{-jr(u - 2\pi k/\Delta x)}\,e^{-j2\pi(k\,\delta x/\Delta x + l\,\delta y/\Delta y)}\right] \tag{5.24}$$

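Let the GEF $H$ have center frequency $(U,V)$, where $U = 2\pi k/\Delta x$ and $V =
2\pi l/\Delta y$ for some $(k,l)$, and select $(\sigma_x/\Delta x,\, \sigma_y/\Delta y)$ as in Section 5.2.1. Applying $H$ to $I$
approximately passes only the sinc pair centered at $(U,V)$:

$$I_f(u,v) = H(u,v)\,I(u,v) \approx \frac{2\pi}{\Delta x\,\Delta y}\,T_1(U,V)\,H(u,v)\,S(u - U, v - V)\left[1 + e^{-jr(u - U)}\,e^{-j2\pi(k\,\delta x/\Delta x + l\,\delta y/\Delta y)}\right]$$

Defining $S_f$ as in (5.15), the inverse Fourier transform of $I_f$ is

$$i_f(x,y) = \frac{2\pi}{\Delta x\,\Delta y}\,T_1(U,V)\left[s_f(x,y) + s_f(x - r, y)\,e^{-j2\pi(k\,\delta x/\Delta x + l\,\delta y/\Delta y)}\right] \tag{5.25}$$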


Let

$$\psi = 2\pi\left(k\,\delta x/\Delta x + l\,\delta y/\Delta y\right) \tag{5.26}$$

$\psi$ represents the total relative phase shift between regions 1 and 2. Computing the
magnitude of $i_f$ completes the application of the Gabor filter and gives

$$m(x,y) = C\left|s_f(x,y) + s_f(x - r, y)\,e^{-j\psi}\right| = C\sqrt{s_f^2(x,y) + s_f^2(x - r, y) + 2 s_f(x,y)\, s_f(x - r, y)\cos\psi} \tag{5.27}$$

where

$$C = \frac{2\pi}{\Delta x\,\Delta y}\,|T_1(U,V)| \tag{5.28}$$

Consider the behavior of $m$. Assume that a phase shift occurs; i.e., $\forall (k,l)$, $\psi \neq$ a
multiple of $2\pi$ or, equivalently, choose some $(k,l)$ such that $\cos\psi \neq 1$. (This holds,
because of the restrictions placed earlier on $\delta x$ and $\delta y$.) The image does not exhibit a
phase discontinuity in the $y$ direction; so in subsequent analyses, it will be assumed that
$y$ is far removed from the image's outer boundaries (i.e., $|y| < s/2$). Consider $m$ over
three regions. The first region consists of those values of $x$ such that $|x| < r/2$ (i.e.,
points in region 1 far from both the texture boundary and the image's outer boundary;
see Fig. 5.1). In this case, (5.19) and (5.20) indicate that, for $r$ large relative to $\sigma_x$,
$s_f(x,y) \approx 1$ and $s_f(x - r, y) \approx 0$. Thus, from (5.27), $m \approx C$. The second region consists
of those values of $x$ for $r/2 < x < 3r/2$ (points in region 2). In this case, $s_f(x - r, y) \approx 1$,
$s_f(x,y) \approx 0$, and $m \approx C$. The last region is in the neighborhood of the texture boundary
($x \approx r/2$). At $x = r/2$, for large $r$, $s_f(x,y) \approx s_f(x - r, y) \approx 1/2$, and from (5.27),

$$m(r/2, y) \approx C\sqrt{0.5\,(1 + \cos\psi)} \tag{5.29}$$

Summarizing,

$$m(x,\cdot) \approx \begin{cases} C, & x < r/2 \\ C\sqrt{0.5\,(1 + \cos\psi)}, & x \text{ near } r/2 \text{ (texture boundary)} \\ C, & x > r/2 \end{cases}$$

Note that near the texture boundary ($x \approx r/2$), $m(x,\cdot) < C$ necessarily, because of the
weighting factor $\sqrt{0.5\,(1 + \cos\psi)}$. Thus, for this situation, a valley signature occurs near
the texture boundary.
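The valley depth follows directly from (5.26) and (5.29). The sketch below evaluates both formulas for a made-up example, a quarter-period shift in $x$ observed at harmonic $(k,l) = (1,0)$. The function names and numerical values are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

def texture_phase_shift(k, l, shift_x, shift_y, period_x, period_y):
    """psi from (5.26): 2*pi*(k*shift_x/period_x + l*shift_y/period_y)."""
    return 2.0 * np.pi * (k * shift_x / period_x + l * shift_y / period_y)

def boundary_to_interior_ratio(psi):
    """m(r/2, y) / C from (5.29): sqrt(0.5 * (1 + cos(psi)))."""
    return np.sqrt(0.5 * (1.0 + np.cos(psi)))

# Quarter-period shift in x, filter tuned to harmonic (k, l) = (1, 0).
psi = texture_phase_shift(k=1, l=0, shift_x=4, shift_y=0, period_x=16, period_y=16)
print(psi, boundary_to_interior_ratio(psi))
```

For a fixed texel shift the phase $\psi$ grows with the harmonic index, so in this model a small spatial shift can still produce a pronounced valley if the filter is tuned to a higher harmonic, provided the texel has energy there.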

When no phase shift exists between the two regions, the transition takes on its
maximum value $C$, which is the same value as for points far removed from the transition.
This is expected, since without a texture-phase shift, the two regions are indistinguishable.
If $\psi = \pi$ (maximum texture-phase difference between two regions), the value of $m$
at the texture boundary is 0 and the deepest possible valley results. Note from (5.26) that the
depth of the valley depends on the ratios $\delta x/\Delta x$ and $\delta y/\Delta y$. These ratios represent the
amount of texture-phase shift in $x$ and $y$ relative to the texel periods. Thus, the greater
the texture-phase shift (up to $\psi = \pi$), the deeper the valley.

5.2.2.2  Ridge Signature

With the Gabor filter designed as indicated above, a ridge signature cannot occur
near the texture boundary. It can be shown, however, that if the Gabor filter is tuned to
a frequency other than a harmonic, a ridge is produced.

To see this, consider the frequency domain representation of the GEF-filtered
image

$$I_f(u,v) = H(u,v)\,I(u,v)$$

In this case, let $U = 2\pi k/\Delta x + \delta U$, $V = 2\pi l/\Delta y + \delta V$, where $\delta U$ and $\delta V$ are chosen so
that $2\pi k/\Delta x < U + \delta U < 2\pi(k + 1)/\Delta x$ and $2\pi l/\Delta y < V + \delta V < 2\pi(l + 1)/\Delta y$. Now,
the inverse Fourier transform $i_f$ has the same form as in (5.25), but now $s_f$ represents
a gate function convolved with a GEF having center frequency $(\delta U, \delta V)$ rather than a
gate function convolved with a simple Gaussian. Convolving a gate function with a GEF
produces a complex quantity. So, $s_f$ becomes complex. Thus, computing the magnitude
of $i_f$ as in (5.27) produces

$$m(x,y) = C\sqrt{P_r P_r^* + P_0 P_0^* + P_0^* P_r\, e^{-j\psi} + P_0 P_r^*\, e^{j\psi}} \tag{5.30}$$

where $P_0 = s_f(x,y)$ and $P_r = s_f(x - r, y)$. It can be shown that if $\delta U \neq 0$ or $\delta V \neq 0$, then the complex
terms $P_0$ and $P_r$ constructively interfere with each other to produce an increase in filter
output near the texture boundary, thus forming a ridge signature. The height of the
ridge depends on the texture phase shift $\psi$. The details leading to this result follow.

Since $s_f$ is now the result of convolving a gate function with a GEF, we have (cf.
(5.19), (5.20))

$$s_f(x,y) = \int_{-s/2}^{s/2}\int_{-r/2}^{r/2} h(x - \alpha, y - \beta)\,d\alpha\,d\beta \tag{5.31}$$

$$s_f(x - r, y) = \int_{-s/2}^{s/2}\int_{r/2}^{3r/2} h(x - \alpha, y - \beta)\,d\alpha\,d\beta \tag{5.32}$$


$h$ is the GEF in (3.2) and can be represented by the sum of its real and imaginary parts:

$$h(x,y) = h_r(x,y) + h_i(x,y) \tag{5.33}$$

where

$$h_r(x,y) = g(x', y')\cos[2\pi(\delta U x + \delta V y)] \tag{5.34}$$

$$h_i(x,y) = -j\,g(x', y')\sin[2\pi(\delta U x + \delta V y)] \tag{5.35}$$

Again consider $m$ over three image regions. The first region is defined by $|x| < r/2$
(points well inside region 1). From (5.31) and (5.33),

$$s_f(x,y) = \int_{-s/2}^{s/2}\int_{-r/2}^{r/2} \left[h_r(x - \alpha, y - \beta) + h_i(x - \alpha, y - \beta)\right]d\alpha\,d\beta \tag{5.36}$$

For large $r$, the integral of $h_r$ approaches some constant $\gamma_1$, and the integral of $h_i$ (since
it is an odd function) approaches zero. Also for large $r$, $s_f(x - r, y)$ approaches zero.
Thus $s_f(x,y) \approx \gamma_1$, $s_f(x - r, y) \approx 0$, and by (5.30) $m(x,y) \approx \gamma_1 C$. A similar argument
holds for points in region 2 ($r/2 < x$).

The third region is the transition near $x = r/2$. At $x = r/2$, (5.36) can be
rewritten as

$$s_f(r/2, y) = \int_{-s/2}^{s/2}\int_{0}^{r} \left[h_r(\alpha, y - \beta) + h_i(\alpha, y - \beta)\right]d\alpha\,d\beta$$

Again assume $r$ to be large. In this case, the imaginary component does not go to zero.
Since the integral of the real component equals $\gamma_1$, we can write

$$\gamma_1 = \int_{-s/2}^{s/2}\int_{0}^{r} h_r(\alpha, y - \beta)\,d\alpha\,d\beta \tag{5.37}$$

For the imaginary component, let

$$j\gamma_2 = \int_{-s/2}^{s/2}\int_{0}^{r} h_i(\alpha, y - \beta)\,d\alpha\,d\beta \tag{5.38}$$

Similarly for $s_f(x - r, y)$,

$$s_f(-r/2, y) = \int_{-s/2}^{s/2}\int_{-r}^{0} \left[h_r(\alpha, y - \beta) + h_i(\alpha, y - \beta)\right]d\alpha\,d\beta$$

Therefore, at $x = r/2$,

$$s_f(x,y) = \gamma_1 + j\gamma_2$$

$$s_f(x - r, y) = \gamma_1 - j\gamma_2$$


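and (5.30) reduces to

$$m(r/2, y) = C\sqrt{2\gamma_1^2 + 2\gamma_2^2 + (\gamma_1^2 - \gamma_2^2 + j2\gamma_1\gamma_2)\,e^{j\psi} + (\gamma_1^2 - \gamma_2^2 - j2\gamma_1\gamma_2)\,e^{-j\psi}}$$

$$\phantom{m(r/2, y)} = C\sqrt{2\gamma_1^2 + 2\gamma_2^2 + 2(\gamma_1^2 - \gamma_2^2)\cos\psi - 4\gamma_1\gamma_2\sin\psi} \tag{5.39}$$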


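Note that when $\psi = 0$ or a multiple of $2\pi$ (i.e., no texture phase shift), $m = \gamma_1 C$ as it
does at points far removed from the transition. For a ridge to occur, $m$ at the transition
$x = r/2$ must be greater than $\gamma_1 C$:

$$\gamma_1 C < m(r/2, y) = C\sqrt{2\gamma_1^2 + 2\alpha^2\gamma_1^2 + 2(\gamma_1^2 - \alpha^2\gamma_1^2)\cos\psi - 4\alpha\gamma_1^2\sin\psi} = \gamma_1 C\, z(\alpha, \psi) \tag{5.40}$$

where we let $\gamma_2$ be represented as

$$\gamma_2 = \alpha\gamma_1 \tag{5.41}$$

for some constant $\alpha$, and

$$z(\alpha, \psi) = \sqrt{2 + 2\alpha^2 + 2(1 - \alpha^2)\cos\psi - 4\alpha\sin\psi} \tag{5.42}$$

where it can be shown that $z > 0$. Thus, summarizing:

$$m(x,\cdot) \approx \begin{cases} \gamma_1 C, & x < r/2 \\ \gamma_1 C\, z(\alpha, \psi), & x \text{ near } r/2 \text{ (texture boundary)} \\ \gamma_1 C, & x > r/2 \end{cases}$$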


and a ridge occurs when $z > 1$. Note that (5.42) is not restricted to determining the
ridge height. For $z < 1$, a valley occurs, and (5.42) can be used to approximate its depth.

Table 5.1. $z$ as a function of $\alpha$ with $\psi_{\max}$ (in degrees) chosen to maximize $z$.

    alpha    psi_max (degrees)    z
    0.1      -11                  1.005
    0.2      -23                  1.020
    0.3      -33                  1.044
    0.4      -44                  1.077
    1.5      -113                 1.803
    2.8      -141                 2.973

Setting $\partial z/\partial\psi = 0$ gives the value of $\psi$ that maximizes $z$:

$$\psi_{\max} = \tan^{-1}\!\left[\frac{-2\alpha}{1 - \alpha^2}\right]$$

Table 5.1 gives $z$ versus $\alpha$ for $\psi_{\max}$ (in degrees).

Observe that as $\alpha \to 0$, $z \to 1$ from above, and $\psi_{\max} \to 0$. Thus, as long as
$\gamma_2 > 0$, a ridge can occur if $\psi$ is sufficiently close to $\psi_{\max}$. From (5.38), it is clear that
$\gamma_2 > 0$ whenever the center frequency $(\delta U, \delta V)$ of $h_i$ is nonzero and finite; i.e., when the
Gabor-filter center frequency is not a harmonic of $i$.
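The entries of Table 5.1 can be reproduced numerically. The sketch below makes one assumption that should be read with care: it places an overall factor of 1/2 in front of the square root in (5.42), so that $z(\alpha, 0) = 1$ in agreement with the statement that the output equals $\gamma_1 C$ when there is no phase shift. With that normalization the maximum simplifies to $z_{\max} = \sqrt{1 + \alpha^2}$, which matches the tabulated values. The function names are hypothetical, and `arctan2` is used so that the angle falls in the quadrant implied by the table for $\alpha > 1$.

```python
import numpy as np

def z_normalized(alpha, psi):
    """Boundary-to-interior output ratio, scaled so that z(alpha, 0) = 1.

    Uses the expression under the root in (5.42) with an overall factor 1/2;
    the factor is an assumption chosen to agree with Table 5.1.
    """
    inside = 2 + 2 * alpha**2 + 2 * (1 - alpha**2) * np.cos(psi) - 4 * alpha * np.sin(psi)
    return 0.5 * np.sqrt(inside)

def psi_max(alpha):
    """Phase shift (radians) that maximizes z for a given alpha."""
    return np.arctan2(-2 * alpha, 1 - alpha**2)

for alpha in (0.1, 0.2, 0.3, 0.4, 1.5, 2.8):
    p = psi_max(alpha)
    print(f"{alpha:3.1f}  {np.degrees(p):7.1f}  {z_normalized(alpha, p):.3f}")
```

Because the maximum grows as $\sqrt{1 + \alpha^2}$, a small $\alpha$ (a small mistuning contribution $\gamma_2$ relative to $\gamma_1$) yields only a weak ridge even at the most favorable phase shift.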

As we have just seen, if an improperly tuned Gabor filter is applied to an image
exhibiting a texture-phase difference, a ridge signature can occur at the texture boundary.
While generating a ridge is not the ideal result, it can still be useful for texture
segmentation. To be able to perform texture segmentation, though, the ridge must be
reasonably strong. The ridge's strength depends upon its height, which depends on $\psi$,
the texture-phase shift, and on $\gamma_2$ in (5.38). Appendix C gives a method for computing
ridge height.

5.2.3  Nonuniform Textures

Unlike uniform textures, nonuniform textures consist of texels that can undergo
perturbations in position, orientation, and shape. These perturbations make analysis
more difficult. Although Clark and Bovik have investigated the effect of texel-position
variability [68], they did not consider perturbations in texel orientation and shape.
Analyzing the effects of texel orientation and shape perturbations is much more difficult,
especially in the general case, because the results depend strongly on individual texel
characteristics. Thus, this thesis does not provide a quantitative analysis for nonuniform
textures, but instead presents qualitative arguments and experimental results (Chapter 8).
Chapter 7 gives more discussion on nonuniformly textured images.

The experimental results indicate that the signatures obtained for uniformly textured
images also can occur for nonuniformly textured images. In contrast to uniform
textures, though, a nonuniformly textured image cannot be represented in the frequency
domain simply as a 2-D impulse train. The perturbations possible in the texels introduce
more frequency components beyond just simple harmonics. The net result is that
the Gabor-filter output signatures obtained for nonuniformly textured images typically
exhibit many local output variations. (See, for example, Fig. 5.2.) In spite of these output
variations, the results of Chapter 8 demonstrate that Gabor-filter outputs derived
from nonuniformly textured images can be useful for texture segmentation.

In addition to the three signature types derived for uniform textures, nonuniformly
textured images can exhibit a fourth signature type. Consider, for example, two regions
that have the same average spatial-frequency content, but the spatial-frequency variation
is different between regions. In this case, the local variation in the Gabor-filter output
$m$ will differ in the two regions. This fourth form of discontinuity will be referred to as a
step change in the average local output variation of $m$. Fig. 1.4 gives an example of this
type of signature.

Fig. 5.2. Example Gabor-filter output from a nonuniformly textured image.

With a step, valley, or ridge discontinuity, standard image-segmentation techniques
could be used to locate the discontinuity. This is not the case, however, with a
change in average local output variation without some further filtering. One possible
solution is to transform this quantity into a change in mean value. Turner [60] encountered
this problem and suggested using a bandpass filter for detecting such local changes.
For our situation, this would involve applying a second Gabor filter to the output $m$ of
the first. If the variations within two regions have similar frequency content, then the
second filter output would be proportional to the magnitude of the variation. Thus, a
difference in average output variation would translate into a difference in mean for the
second filter's output.

Another simple method for transforming a difference in average local output variation
into a difference in mean is to perform the following operation on m:

$$\mathrm{LPF}\{\,|m(x,y) - \mu_m|\,\} \qquad (5.43)$$

Here $\mu_m$ is the mean value of m (DC component) and LPF is a low-pass filter. This
method does not make any assumptions on the frequency content of the input. Other
methods are possible.
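The operation in (5.43) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the text does not specify which low-pass filter to use, so a separable moving-average filter is assumed here, and the names `box_lowpass` and `variation_to_mean` are hypothetical.

```python
import numpy as np

def box_lowpass(img, size):
    """Separable moving-average low-pass filter (assumed LPF; size must be odd)."""
    kernel = np.ones(size) / size
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    # Filter along rows, then along columns; 'valid' trims the padding again.
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def variation_to_mean(m, size=15):
    """Compute LPF{|m - mu_m|}, with mu_m the global mean (DC component) of m."""
    mu_m = m.mean()
    return box_lowpass(np.abs(m - mu_m), size)
```

Applied to a filter output whose two halves share the same mean but differ in local variation, the result has a clearly different mean level in each half, which a standard step detector could then locate.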

Chapter 7 examines nonuniformly textured images in depth and gives a numerical
technique for measuring the discriminability of arbitrary pairs of textures. This measure
of discriminability, then, leads to a method for selecting appropriate Gabor-filter
parameters for discriminating any given texture pair.


Chapter  6 


Gabor-Filter  Parameter  Guidelines 


Section  5.2  showed  that  when  a  properly  tuned  Gabor  filter  is  applied  to  a  textured 
image,  distinct  output  signatures  arise  at  the  texture  boundaries.  This  chapter  describes 
how  to  select  filter  parameters  for  a  properly  tuned  Gabor  filter.  Section  6.1  provides 
parameter  guidelines  based  on  previous  analyses.  Then  Section  6.2  discusses  parameter 
constraints  that  arise  when  a  bank  of  filters  is  to  be  designed.  For  nonuniform  and 
natural  textures,  however,  the  guidelines  provided  by  analyses  are  only  approximately 
correct;  so  in  Chapter  7,  an  algorithm  is  presented  for  numerically  determining  filter 
parameters  for  an  arbitrary  texture  pair. 

6.1  Guidelines  for  Selecting  Filter  Parameters 

Based on the analyses of Chapters 4 and 5, this section provides guidelines for
selecting the parameters for a properly tuned Gabor filter. Chapter 8 provides design
examples and image-segmentation results. This section assumes that the given image
contains two uniformly textured regions, whose constituent texels are $t_1$ and $t_2$ (as defined
in Section 5.1) and whose texel spacings are $(\Delta x_1, \Delta y_1)$ and $(\Delta x_2, \Delta y_2)$. For convenience,
Table 6.1 summarizes the parameter selection criteria.

Table 6.1. Gabor-filter design criteria for processing a textured image containing
two textured regions, $R_1(t_1, \Delta x_1, \Delta y_1)$ and $R_2(t_2, \Delta x_2, \Delta y_2)$.

A) Uniform Textures, constituent texels $t_1$ and $t_2$ differ and/or texel spacings $(\Delta x_1, \Delta y_1)$
   and $(\Delta x_2, \Delta y_2)$ differ → Gabor-filter output: Step Signature.

   $(\sigma_x, \sigma_y)$ = Spatial extent of GEF's Gaussian envelope.
      a) Recommendation: $\sigma_x = \Delta x$, $\sigma_y = \Delta y$, where
         $\Delta x = \max(\Delta x_1, \Delta x_2)$; $\Delta y = \max(\Delta y_1, \Delta y_2)$.
      b) If $\Delta x_1 \neq \Delta x_2$ or $\Delta y_1 \neq \Delta y_2$, then GEF cannot be tuned to both harmonics
         and undershoot/overshoot can occur in Step Signature.
      c) $(\sigma_x, \sigma_y)$ large relative to $(\Delta x, \Delta y)$ → output signatures cleaner.
      d) $(\sigma_x, \sigma_y)$ small relative to $(\Delta x, \Delta y)$ → texture boundary better localized.

   $\lambda = \sigma_y/\sigma_x$ = aspect ratio of Gaussian envelope.

   (U, V) = GEF center frequency.
      a) Recommendation: $U = 2\pi k/\Delta x$, $V = 2\pi l/\Delta y$, where
         $(k, l) = \arg\{\max |T_1(U,V) - T_2(U,V)|\}$, k, l integers.
      b) Depends on texel spacing $(\Delta x, \Delta y)$ and differences between harmonics.
      c) Tune center frequency to harmonic exhibiting maximum difference.
      d) If not tuned precisely to an harmonic, undershoot/overshoot could result
         in output signature.

   $\theta$ = Orientation of Gaussian envelope.
      a) Recommendation: Select orientation along direction of texel-spacing lattice.
      b) Independent of $(\sigma_x, \sigma_y, U, V)$.

B) Uniform Textures, Texture-Phase Difference (texels identical in two regions ($t_1 = t_2$),
   texel spacing same in two regions $\Delta x_1 = \Delta x_2$, $\Delta y_1 = \Delta y_2$, and regions shifted relative
   to each other) → Gabor-filter output: Valley or Ridge Signature.

   $(\sigma_x, \sigma_y)$   Design criteria as in Part A.

   $\lambda = \sigma_y/\sigma_x$

   (U, V)
      a) Recommendation: Select (U, V) equal to an harmonic
         $(2\pi k/\Delta x, 2\pi l/\Delta y)$ as in Part A → produces valley signature.
      b) If (U, V) $\neq$ an harmonic → nonideal, but still discriminating,
         ridge signature results.

   $\theta$   Design criteria as in Part A.

C) Nonuniform Textures: Texels ($t_1$, $t_2$) differ and texels perturbed in position, orientation,
   and shape → Gabor-filter output: Step, Valley, or Ridge Signature with output variation or
   Difference in Local Output Variation Signature.

   All filter parameters
      - Use guidelines in Part A.
      - Use techniques of Chapter 7 for parameter selection.

6.1.1 Texels in Two Textured Regions Differ

Assume that the texels $t_1$ and $t_2$ differ. Thus, the discussion focuses on the design
of a Gabor filter tuned for producing a step-signature output. Also, assume for the time
being that the texel spacings for the two textures are identical; i.e., $\Delta x_1 = \Delta x_2 = \Delta x$,
$\Delta y_1 = \Delta y_2 = \Delta y$. This condition will later be relaxed.

The parameters to select for the Gabor filter are $(\sigma_x, \sigma_y)$, $\lambda$, (U, V), and $\theta$. Fig. 6.1
illustrates the relationship between filter size and texel spacing. The ellipse represents
the one-standard-deviation contour of the Gaussian envelope of a Gabor filter; i.e., the
points (x, y) such that $(x/\sigma_x)^2 + (y/\sigma_y)^2 = 1$. The positioning of the ellipse at point (x, y) represents
the position of the GEF when the Gabor-filter output m is computed at point (x, y).

The choice of $\sigma_x$ and $\sigma_y$ is a tradeoff between Gabor-filter output variation and
accurate boundary localization. When $\sigma_x \gg \Delta x$ and $\sigma_y \gg \Delta y$, the filter envelope encompasses
multiple texels, regardless of its position in the image. Although the positions of
the texels vary within the envelope as the filter progresses across the image, the Gabor-filter
output m remains approximately constant over a region. If $\sigma_x \ll \Delta x$ or $\sigma_y \ll \Delta y$,
the filter output depends on whether or not a texel occurs within the GEF envelope.
This results in periodic Gabor-filter output variations throughout a region. To avoid
significant output variation, $\sigma_x/\Delta x$ and $\sigma_y/\Delta y$ should both be large; i.e., the GEF's
spatial extent should cover a number of texels. If $(\sigma_x, \sigma_y)$ are large, though, near the
texture boundary, the filter envelope will extend into both regions. This region overlap is
what produces the sigmoid output transition described in Section 5.2.1. As $(\sigma_x, \sigma_y)$ become
larger, the transition becomes more gradual, making it more difficult to locate the
texture boundary. Experimental evidence suggests that filter performance is relatively
insensitive to these ratios. A good compromise is to set them to unity; i.e., $\sigma_x = \Delta x$,
$\sigma_y = \Delta y$.

Fig. 6.1. Schematic representation of the application of a Gabor filter to
a uniformly textured image. Image consists of two adjacent regions, each
containing nine texels. The ellipse represents the application of a GEF at
one point in the convolution (3.1).

If the texel spacings in the x and y directions differ ($\Delta x \neq \Delta y$) [but the two
textures still use the same spacing], then using the aforementioned design criteria, $\sigma_x \neq \sigma_y$.
Thus the filter's aspect ratio $\lambda = \sigma_y/\sigma_x \neq 1$, resulting in an asymmetric filter. For
asymmetric filters, the orientation $\theta$ of the Gaussian in the GEF (3.2) becomes an issue.
Based on the discussion above, the Gaussian should be oriented to encompass on the
average as many texels as possible. If the texels are spaced over a rectangular lattice, the
Gaussian should be oriented along the x and y axes (i.e., $\theta = 0$ or $\pi/2$). If the texels are
not spaced over such a lattice but are situated relative to some rotated coordinate system
(x', y'), then the Gaussian should be oriented along the rotated axes. The orientation of
the complex sinusoid $\phi$ is determined by the Gabor-filter center frequency (U, V), and
thus by the analysis in Section 5.2, depends on the spectral differences between texels.
Therefore, the choice of $\theta$ is in general independent of $\phi$.
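The envelope guidelines above can be turned into a small construction routine. The sketch below is illustrative only: the GEF definition (3.2) is not reproduced in this chapter, so a rotated Gaussian envelope (normalized to unit sum) multiplied by a complex sinusoid is assumed, and the names `recommended_envelope` and `gabor_elementary_function` are hypothetical. The center frequency (U, V) is taken as given.

```python
import numpy as np

def recommended_envelope(dx1, dy1, dx2, dy2):
    """Envelope sizes per the unity-ratio compromise: sigma equals the larger texel spacing."""
    sigma_x = max(dx1, dx2)
    sigma_y = max(dy1, dy2)
    return sigma_x, sigma_y, sigma_y / sigma_x  # last value is the aspect ratio lambda

def gabor_elementary_function(sigma_x, sigma_y, U, V, theta=0.0):
    """Sampled GEF: Gaussian oriented at theta times exp(j(Ux + Vy)) (assumed form)."""
    half = int(np.ceil(3 * max(sigma_x, sigma_y)))   # truncate at three standard deviations
    r = np.arange(-half, half + 1)
    x, y = np.meshgrid(r, r)                          # x varies along columns
    # Rotate coordinates so the envelope axes follow the texel lattice.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    g = np.exp(-0.5 * ((xr / sigma_x) ** 2 + (yr / sigma_y) ** 2))
    g /= g.sum()                                      # unit-sum envelope (assumption)
    return g * np.exp(1j * (U * x + V * y))
```

Because the envelope orientation `theta` enters only through the Gaussian and (U, V) only through the sinusoid, the two can be chosen separately, in line with the independence noted in the text.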

The choice of center frequency (U, V) depends on the texel spacing (which determines
the harmonics) and on the spectral differences between texels at the harmonics.
As discussed in Section 5.2.1, (U, V) should be set to the harmonic that differs most in
power between the texels in the two regions. Although two texels might differ more at
some nonharmonic frequency, using this frequency as a filter center frequency in general
produces an output signature that exhibits overshoot and/or undershoot; such signatures
have lower values within the textures than the values produced by a properly tuned
filter.  This  is  shown  in  Appendix  B  (compare  (Ai,  A2)  to  (Ai,A2)). 
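The harmonic search described above can be sketched as follows. This is a hedged illustration, not the procedure used in the thesis: it assumes the two texels are available as small sampled arrays, scores each harmonic by the difference in spectral magnitude (one reading of "differs most in power"), and skips the DC term. The function names and the search limits `max_k` and `max_l` are hypothetical.

```python
import numpy as np

def texel_spectrum(texel, U, V):
    """Fourier transform of a sampled texel evaluated at radian frequency (U, V)."""
    rows, cols = texel.shape
    y, x = np.mgrid[0:rows, 0:cols]
    return np.sum(texel * np.exp(-1j * (U * x + V * y)))

def best_harmonic(texel1, texel2, dx, dy, max_k=3, max_l=3):
    """Return (difference, k, l, U, V) for the harmonic with the largest magnitude difference."""
    best = None
    for k in range(max_k + 1):
        for l in range(max_l + 1):
            if k == 0 and l == 0:
                continue  # skip DC (assumption: a bandpass filter is wanted)
            U, V = 2 * np.pi * k / dx, 2 * np.pi * l / dy
            diff = abs(abs(texel_spectrum(texel1, U, V)) - abs(texel_spectrum(texel2, U, V)))
            if best is None or diff > best[0]:
                best = (diff, k, l, U, V)
    return best
```

For example, with an 8 x 8 texel containing two cycles of a horizontal cosine and a flat 8 x 8 texel, both on an 8-pixel lattice, the search selects k = 2, l = 0.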
6.1.2 Texel Spacings Differ between Textured Regions

When  texel  spacing  is  the  same  in  both  regions,  each  texture  has  spectral  energy 
centered  about  the  same  harmonics  (cf.  (5.13)),  and  a  Gabor  filter  can  be  designed 
to  produce  step-signature  outputs.  If  the  texel  spacings  of  the  two  regions  differ,  the 
harmonics from the different textures do not coincide. Since a Gabor filter can be tuned
to only one harmonic, signature distortion will result. In analyzing this distortion, note
that the Gabor-filter operation (prior to computing the magnitude) is linear, allowing
the study of each region independently. Assume that the Gabor-filter center frequency
(U, V) equals an harmonic of region 1. Then, the analysis proceeds as for the step
signature. The frequency coordinates for the nearest corresponding harmonic of region
2 can be written as $(U + \delta U, V + \delta V)$, where $(\delta U, \delta V)$ is the frequency offset between
the harmonics of the two regions. Thus, the analysis of region 2 becomes analogous to
that  for  the  ridge  signature.  Combining  the  results  for  the  two  regions  and  computing 
the  magnitude  results  in 

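$$m(x,y) = |v(x,y)| = 2\pi\sqrt{P_r P_r^* + P_q P_q^* + P_r^* P_q + P_q^* P_r} \qquad (6.1)$$

where $P_q$ and $P_r$ are defined in terms of $S_f(x,y)$ and $S_f(x - r, y)$, respectively, and $S_f(x,y)$ and $S_f(x - r, y)$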
are  given  by  (5.19)  and  (5.32).  For  points  far  removed  from  the  texture  boundary,  an 
analysis  similar  to  that  in  Section  5.2.1  reveals  that  (6.1)  produces  a  step  signature. 
The  analysis  near  the  texture  boundary,  however,  suffers  from  the  same  complications 
encountered in evaluating overshoot and undershoot (Appendix B). Although a detailed
analysis is impractical, the presence of the GEF integral in (6.1) suggests that overshoot
and/or undershoot can be expected. An example in Chapter 8 corroborates this
observation. 

6.2  Parameter  Constraints  for  Filter  Banks 

Section  6.1  provided  guidelines  for  selecting  filter  parameters  for  individual  filters. 
These  guidelines,  however,  are  based  upon  specific  image  characteristics.  In  general, 
though,  these  characteristics  are  not  known  a  priori  (assuming  that  ultimately  we  are 
striving  for  a  truly  autonomous  texture-segmentation  system),  and  thus  it  is  difficult  to 
choose  appropriate  Gabor-filter  parameters.  Instead,  a  collection  of  such  filters  must  be 
specified  (i.e.,  a  filter  bank),  where  each  filter  is  tuned  to  a  different  frequency  band,  and 
collectively  they  span  the  range  of  frequencies  expected  in  the  input.  These  filters  are 
then  applied  to  the  image  (conceptually  in  parallel  [56]),  and  their  outputs  are  combined 
in  a  meaningful  way,  so  as  to  partition  the  image  into  regions  of  homogeneous  texture. 
Defining  a  filter  bank  involves  specifying  the  number  of  filters  within  the  filter  bank  and 
the  parameters  for  each  of  these  filters.  The  need  to  combine  filter  outputs  imposes 
certain  restrictions  on  these  filters.  These  restrictions  are  discussed  below. 

6.2.1 A Constraint on $\sigma$

Assume for the moment that all textured images to be encountered have the same
texel spacing and that only symmetric Gabor filters are to be used (i.e., $\sigma_x = \sigma_y = \sigma$).
This section shows that if Gabor-filter outputs are to be compared, the corresponding
Gabor filters must have equal values of $\sigma$.

Let us assume that the input image is periodic with period T in both directions.
(This simplifies the frequency analysis without affecting the grayscale distribution in the
regions of interest.) The image, then, can be represented by its complex Fourier series

$$i(x,y) = \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} c_{m,n}\, e^{j\omega(nx + my)}$$

where $\omega = 2\pi/T$, and $c_{m,n}$ are the Fourier-series coefficients. Consider the GEF h
of (3.5). For simplicity, assume that h is oriented along the x axis. Then x' = x.
Convolution of i with h after separating the integrals yields


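$$\tilde{i}(x,y) = i(x,y) * h(x,y) = \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} c_{m,n}\, e^{j\omega(nx + my)}\, e^{-\frac{\sigma^2}{2}\left[(\omega m)^2 + (\Omega - \omega n)^2\right]} \qquad (6.2)$$

where the exponential factor results from evaluating the separated Gaussian integrals.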

Equation (6.2) shows that the resulting subimage $\tilde{i}$ is composed of the original
input with each term of the input being reduced by the factor
$e^{-\frac{\sigma^2}{2}[(\omega m)^2 + (\Omega - \omega n)^2]}$. Two
features of this equation should be emphasized. First, the only term in the input that
is left unattenuated is the complex sinusoid oriented in the x direction (i.e., m = 0),
with a frequency equal to the center frequency of the filter (i.e., $n\omega = \Omega$). Second,
the parameter $\sigma$, combined with the frequency of the harmonic, controls the degree of
attenuation of the harmonic. This suggests that it is not feasible to make comparisons
between filter outputs, if $\sigma$ is not the same for both filters.

For example, consider two filters with the same center frequency, but different
$\sigma$'s. If these filters are applied to the same input, the filter with the smaller $\sigma$ will
usually produce a larger output. Unless the frequency distribution of the input is known,
however, the amount of difference cannot be determined. That is, filter outputs cannot
be normalized without knowing the frequency distribution of the input. Thus all filters
within a given filter bank must have the same value of $\sigma$.

The choice of $\sigma$ depends on the texel spacing; so images with different texel
spacings (e.g., images at different scales) require filters with different values of $\sigma$. Since
the output of filters with different $\sigma$'s cannot be reliably compared, multiple filter banks
must be used, with each bank consisting of a collection of filters with the same $\sigma$. The
idea is to partition the range of texel spacings into k intervals, specify the $\sigma_k$'s, and
define a filter bank for each $\sigma_k$.
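A short numerical sketch may help show why outputs from filters with different $\sigma$ cannot be normalized against each other. It evaluates the per-harmonic attenuation factor $e^{-\frac{\sigma^2}{2}[(\omega m)^2 + (\Omega - \omega n)^2]}$ for two filters that share a center frequency; the function name `attenuation` and the particular values of T, $\sigma$, and the harmonic indices are illustrative assumptions.

```python
import numpy as np

def attenuation(sigma, Omega, omega, m, n):
    """Factor by which Fourier-series term (m, n) is reduced by a GEF with spread sigma."""
    return np.exp(-0.5 * sigma ** 2 * ((omega * m) ** 2 + (Omega - omega * n) ** 2))

T = 64.0
omega = 2 * np.pi / T      # fundamental frequency of the periodic image
Omega = 8 * omega          # both filters tuned to the n = 8 harmonic along x

for n in (8, 9, 10):
    narrow = attenuation(4.0, Omega, omega, 0, n)   # smaller sigma, wider passband
    wide = attenuation(8.0, Omega, omega, 0, n)     # larger sigma, narrower passband
    print(n, narrow, wide, narrow / wide)
```

The ratio between the two filters' factors changes from harmonic to harmonic, so the relative size of the two outputs depends on how the input's energy is distributed, which is the reason given above for fixing $\sigma$ within a bank.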

6.2.2  Other  Parameter  Constraints 

Because the filters within a filter bank should span the expected 2-D frequency
range of the input images, large texel spacings can present a problem. As the texel
spacing becomes larger, $\sigma$ becomes larger, and thus, the bandwidth of the filters becomes
narrower. This means that more filters are required to cover the same frequency band.
Although there is an upper bound on the number of frequencies in the band (dictated by
the number of image pixels sampled), the number could become very large. This problem
can be circumvented by recalling that as an image increases in scale, its frequency content
is compressed; i.e., $g(ax) \leftrightarrow \frac{1}{|a|} G(\omega/a)$. Images tend to have most of their energy around
DC, with energy diminishing rapidly at the higher frequencies. This means that there
exists a cutoff frequency $f_c$ above which their energy is insignificant. Since increasing
the size of an image results in a proportional compression of frequency, the net effect
is a similar reduction in cutoff frequency. Even though the frequency spacing between
filters decreases with increasing $\sigma$, the cutoff frequency decreases proportionally. Thus,
the number of filters remains constant. Therefore, a fixed number of frequencies can be
assigned to each bank of filters, without incurring a significant loss in energy.

Once $\sigma$ has been specified (thus defining a filter bank for a particular interval of
texel spacings), the frequency parameters $(\Omega, \phi)$ must be specified for each filter within
the bank. Since the frequency content of an image is not typically known a priori, the
filters must cover the entire range of expected frequencies; however, because the image
is sampled, the entire range of frequency harmonics (in both the u and v directions)
is known. Given a pair of frequencies U and V, it is easy to compute $\Omega$ and $\phi$:
$\Omega = \sqrt{U^2 + V^2}$ and $\phi = \tan^{-1}(V/U)$. With this information, all possible values of $\Omega$ and $\phi$
can be determined. If the number of samples is large, however, it would be impractical
to have a filter at every possible frequency. It might be sufficient to have significant
overlap between adjacent filters to cover the desired frequency range and 360 degrees
of orientation. The amount of frequency and orientation overlap is determined by the
center frequency spacing and the bandwidth of the filter, which is controlled by $\sigma$. One
possible choice for filter spacing is to make the difference between center frequencies of
adjacent filters equal to the half-peak bandwidth of one of the filters. A similar choice
can be made for orientation overlap, and thus the number of required orientations can be
determined. Bovik et al. [53] derived a half-peak orientation bandwidth for the GEFs.
The radian bandwidth Z is defined as $Z = 2\tan^{-1}[2a/(\Omega\sigma)]$ where $a = \sqrt{(\ln 2)/2}$. Thus,
for each center frequency $\Omega$, we define a set of filters, each with a different orientation
parameter $\phi$, and each $\phi$ spaced Z radians apart.
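The relations in the preceding paragraph can be collected into a small sketch that converts (U, V) to $(\Omega, \phi)$ and lists the orientations needed on one ring of center frequencies. The function names are hypothetical, `atan2` is used in place of $\tan^{-1}(V/U)$ so that all four quadrants are handled, and covering the full circle with $\lceil 2\pi/Z \rceil$ filters is an assumption about how the spacing rule would be applied.

```python
import math

def polar_frequency(U, V):
    """Radial center frequency Omega and sinusoid orientation phi for (U, V)."""
    return math.hypot(U, V), math.atan2(V, U)

def orientation_bandwidth(Omega, sigma):
    """Half-peak orientation bandwidth Z = 2 atan(2a / (Omega sigma)), a = sqrt(ln(2)/2)."""
    a = math.sqrt(math.log(2) / 2)
    return 2 * math.atan(2 * a / (Omega * sigma))

def orientations_for_ring(Omega, sigma):
    """Orientations spaced Z radians apart, enough to cover 360 degrees."""
    Z = orientation_bandwidth(Omega, sigma)
    count = math.ceil(2 * math.pi / Z)
    return [i * Z for i in range(count)]
```

Because Z shrinks as the product $\Omega\sigma$ grows, rings at higher center frequencies (or banks with larger $\sigma$) need more orientations to keep the same overlap.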

One popular filter configuration that is consistent with these constraints is the
“rosette” pattern [63, 70, 73]. In the 2-D frequency plane, the rosette consists of overlapping
filters whose center frequencies lie on concentric circles centered at the origin. This
configuration spans 360 degrees of orientation and spans frequencies from DC upward
to any desired resolution. One formulation that directly leads to this pattern is that of Gabor
wavelets [62, 63]. An example of such a pattern is shown in Fig. 6.2. One limitation of
the rosette pattern is that it does not allow independent selection of $(\sigma_x, \sigma_y)$ and (U, V);
however, in practice this is not a problem. Recall that $(\sigma_x, \sigma_y)$ are related to the texel
spacing. Since texel spacing changes with image scale and frequency content is proportional
to scale, $(\sigma_x, \sigma_y)$ and (U, V) are related. Thus independent selection of center
frequency and filter size might not be necessary. Section 6.1.1 showed that asymmetric
filters can be beneficial when the texel spacings differ in x and y. The rosette pattern,
however, does not allow for varying filter asymmetry. Often in practice, though, smooth
signatures can still be attained at some cost in boundary localization if the texel-spacing
difference is disregarded. Thus, it appears that the rosette pattern is a plausible filter-bank
configuration. Determining the number of filter banks, the number of filters in each
filter bank, and the filter spacing in the 2-D frequency plane are topics for future work.

Fig. 6.2. Example of a “rosette” pattern of bandpass filters (from Porat and
Zeevi [73]).


Chapter  7 


Determining  Filter  Parameters  for  Nonuniform  Textures 


The parameter guidelines developed in Section 6.1 are only approximately correct
for nonuniform and natural textures. Since $\sigma_x$, $\sigma_y$, and $\theta$ depend primarily on texel
organization, the guidelines for these parameters are still applicable. The frequency
parameters (U, V), however, are no longer simply related to the difference in the texel
Fourier transforms. Thus, the methods of Section 6.1 are inadequate for determining
(U, V). Previous efforts in determining Gabor-filter frequency parameters have involved:
(1) computing the Fourier transforms of the textures of interest and determining the most
discriminating  frequency  [53],  (2)  using  heuristics  gleaned  from  studies  of  the  human 
visual  system  [56,  60,  74],  (3)  performing  a  spectral  decomposition  on  prototype  texture 
elements  for  each  texture  of  interest  and  noting  where  large  differences  occur  [15,  55], 
and  (4)  ad  hoc  selection  [62,  69].  As  Section  7.3  later  points  out,  these  methods  all  have 
limitations. 

This chapter develops an algorithm for determining the Gabor-filter frequency
parameters for any given texture pair. For convenience, the algorithm will be referred
to as GFFS (for Gabor-Filter Frequency Selection). Given instances of a texture pair
of interest, GFFS searches the space of Gabor-filter center frequencies to determine
the Gabor filter that provides the “greatest” discrimination between the two textures.
Note that this is a supervised approach to frequency selection. The method provides an
analytical tool for evaluating the segmentability of texture pairs using a single filter. The
remainder of the chapter elaborates on GFFS and compares it to previously proposed
techniques for determining Gabor-filter center frequencies. Chapter 8 shows experimental
results demonstrating the efficacy of the new technique.

7.1  Overview  of  the  GFFS  Algorithm 

As  mentioned  previously,  the  application  of  a  Gabor  filter  to  a  textured  image 
i(x,y) can produce an output image m(x,y) exhibiting some type of discontinuity at
the  texture  boundaries  (called  signatures).  This  output  then  can  be  used  to  segment 
the  image.  The  problem  is  to  find  Gabor-filter  parameters  that  will  produce  one  of  these 
discontinuities  at  the  texture  boundary. 

Depending  on  the  texture  pair  and  the  filter  parameters,  different  signature  types 
can  occur.  The  most  common  of  these  is  the  step  signature  (i.e.,  a  step  change  in 
filter output m(x,y)). For the next several sections, it will be assumed that we wish
to  design  a  Gabor  filter  that  produces  the  “best”  step  signature  at  a  texture  boundary. 
Section  7.4  elaborates  on  how  the  method  can  be  extended  to  other  signature  types.  The 
method  for  determining  Gabor-filter  parameters  for  producing  a  step  signature  will  now 
be  described. 

The  problem  statement  is  the  following.  Given  a  textured  image  consisting  of 
known  textured  regions  A  and  B,  find  the  Gabor  filter  giving  the  largest  step  change  at 
the texture boundary. This Gabor filter is determined by the parameters $\sigma_x$, $\sigma_y$, U, V,
and $\theta$ per (3.4), and must be found from among the space of all possible Gabor filters.

Chapter 6 provided guidelines for selecting $\sigma_x$, $\sigma_y$, and $\theta$. Experience indicates
that these guidelines are also effective for nonuniform and natural textures. This allows
us to use heuristic methods for determining these parameters (more on this in Section 7.2.3).
Thus, the method reduces to determining the Gabor-filter center frequencies
(U, V). This is accomplished by essentially performing an exhaustive search over all
possible frequencies.

In  principle,  the  quality  of  a  step  signature  is  determined  by  the  amplitude  of  the 
step.  Previous  analyses  (Section  5.2)  and  experimental  results  (Chapter  8),  however, 
show that the Gabor-filter output m(x,y) resembles an ideal step only in special cases.
More  often,  due  to  the  inherent  random  structure  within  texture,  the  step  is  accompanied 
by  considerable  local  variation.  Thus,  directly  measuring  the  step  amplitude  is  infeasible. 
Instead,  stochastic  decision  theory  is  used  to  develop  an  alternative  measure  of  step- 
signature  quality. 

Developing a measure of step-signature quality begins by modeling the Gabor-filter
outputs from textured regions A and B as independent random variables, having
pdf's $p_A$ and $p_B$ (Section 7.2.1). Then, for each $\sigma_x$, $\sigma_y$, and $\theta$ considered, the “best”
Gabor-filter center frequencies (U, V) are determined as follows:

1. Apply a windowed Fourier transform (WFT) to a random set of points within each
textured region A and B; this effectively gives information on the application of
a family of Gabor filters to each of the random points (Section 7.2.2). Each filter
in the family has a different center frequency (U, V). Collectively these center
frequencies effectively span the frequency domain of the textured region. Using
the computed WFT information, estimate parameters for the pdf's $p_A$ and $p_B$
(Section 7.2.1).

2. Using the estimated pdf's for textures A and B, apply a likelihood-ratio test to
compute the probability of correctly determining from which region (either A or
B) a Gabor-filter output value arose. This probability indicates the statistical
difference between the Gabor-filter outputs in the two regions, and is used as a
measure of step-signature quality.

3. The center frequency (U, V) producing the highest quality step signature is deemed
the “best” and used to design h in (3.2).

The complete algorithm is summarized below. For each $\sigma_x$, $\sigma_y$, and $\theta$ of interest,
do the following:

1. For each textured region A and B,

   a. Form a randomly selected set s of points within the region.

   b. For each point $(X, Y) \in s$, compute

      $$F_{X,Y}(U,V) = \left| \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} i(x,y)\, g(x - X, y - Y)\, \exp[-j(Ux + Vy)]\, dx\, dy \right| \qquad (7.1)$$

      where g is the Gaussian (3.3) and i is the image. F is the windowed Fourier
      transform of i centered at (X, Y) and g is the window function. The computation
      of F, which is implemented as an N x N DFT, effectively applies
      a family of Gabor filters to the point (X, Y), where the center frequencies
      (U, V) of the filters correspond to the N x N set of 2-D frequencies given by
      the DFT.

   c. For each (U, V), compute

      $$\mu(U,V) = \frac{\sum_{s} F_{X,Y}(U,V)}{\mathrm{card}(s)} \qquad (7.2)$$

      $$\sigma^2(U,V) = \frac{\sum_{s} \left( F_{X,Y}(U,V) - \mu(U,V) \right)^2}{\mathrm{card}(s)} \qquad (7.3)$$

      where $\mu(U,V)$ and $\sigma^2(U,V)$ are the sample mean and sample variance for the
      values of F averaged over all points (X, Y) considered in step b above.

2. For each (U, V), compute $P_e(U,V)$, the total probability of incorrectly classifying
textures A and B, per (7.10). $P_e(U,V)$ gives a measure of step-signature quality
for a textured image (containing textured regions A and B) filtered by a Gabor
filter having parameters $(\sigma_x, \sigma_y, U, V, \theta)$.

3. The value of (U, V) corresponding to the minimum value of $P_e(U,V)$ is the “best”
Gabor-filter center frequency.

After applying the procedure above, one “best” center frequency (U, V) is obtained for
each set $(\sigma_x, \sigma_y, \theta)$ considered. Two options are now available: either pick the “best”
center frequency for large values of $(\sigma_x, \sigma_y)$ (as described in Section 7.2.3) and apply the
Gabor filter in (3.1), or pick the “best” center frequency for small values of $(\sigma_x, \sigma_y)$ and
apply the modified Gabor filter ((3.1) followed by (7.15)) as discussed in Section 7.2.4.
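The search procedure can be sketched in NumPy as below. This is a rough illustration under several assumptions, not the thesis implementation: each texture is supplied as its own sample image, the Gaussian window is axis-aligned ($\theta = 0$), and, because (7.10) is not reproduced in this section, step 2 uses a stand-in error measure (the error of two Gaussians with equal priors and an equal-error threshold). All function names are hypothetical.

```python
import numpy as np
from math import erfc, sqrt

def wft_stats(region, sigma_x, sigma_y, N, n_points, rng):
    """Steps 1a-1c: sample mean and variance of the WFT magnitude over random points."""
    half = N // 2
    r = np.arange(-half, N - half)
    wx, wy = np.meshgrid(r, r)
    window = np.exp(-0.5 * ((wx / sigma_x) ** 2 + (wy / sigma_y) ** 2))
    rows, cols = region.shape
    # Step 1a: random points whose N x N neighborhood lies inside the region.
    Ys = rng.integers(half, rows - N + half + 1, n_points)
    Xs = rng.integers(half, cols - N + half + 1, n_points)
    # Step 1b: windowed DFT magnitude at each point (one value per DFT frequency).
    F = np.array([
        np.abs(np.fft.fft2(region[Y - half:Y - half + N, X - half:X - half + N] * window))
        for X, Y in zip(Xs, Ys)
    ])
    # Step 1c: statistics over the point set, per (7.2) and (7.3).
    return F.mean(axis=0), F.var(axis=0)

def error_proxy(mu_a, var_a, mu_b, var_b):
    """Stand-in for P_e (NOT equation (7.10)): Gaussian error at the equal-error threshold."""
    d = np.abs(mu_a - mu_b) / (np.sqrt(var_a) + np.sqrt(var_b) + 1e-12)
    return 0.5 * np.vectorize(erfc)(d / sqrt(2.0))

def gffs(region_a, region_b, sigma_x, sigma_y, N=32, n_points=50, seed=0):
    """Return (U, V, error) for the DFT frequency with the smallest error proxy."""
    rng = np.random.default_rng(seed)
    mu_a, var_a = wft_stats(region_a, sigma_x, sigma_y, N, n_points, rng)
    mu_b, var_b = wft_stats(region_b, sigma_x, sigma_y, N, n_points, rng)
    pe = error_proxy(mu_a, var_a, mu_b, var_b)            # step 2 (stand-in)
    row, col = np.unravel_index(np.argmin(pe), pe.shape)  # step 3
    # DFT column index maps to U and row index to V; indices above N/2 are negative frequencies.
    return 2 * np.pi * col / N, 2 * np.pi * row / N, pe[row, col]
```

Since the magnitude spectrum of a real image is symmetric, each candidate frequency and its negative receive the same score, so the sketch simply reports the first minimum it encounters.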

Section  7.2  provides  justification  and  comments  on  the  various  aspects  of  the 
GFFS  algorithm.  In  particular,  it  (1)  derives  a  measure  of  step-signature  quality  based 
on  stochastic  signal  detection  theory,  (2)  shows  how  the  simultaneous  application  of 
a family of Gabor filters can be implemented efficiently using the windowed Fourier
transform, and (3) discusses the selection of $\sigma_x$ and $\sigma_y$.

7.2  Algorithm  Implementation  Issues 

7.2.1  Measuring  Step-Signature  Quality 

Before  Gabor-filter  parameters  can  be  evaluated,  some  measure  of  step-signature 
quality  needs  to  be  established.  Since  the  location  of  the  step  transition  presumably 
corresponds  to  the  texture  boundary,  basing  signature  quality  on  accurate  step-edge 
detection and localization is attractive. In Canny's development of an ideal step-edge
detector, he shows that both the detection and localization of the step improve directly
as $A/n_0$ increases, where A is the step amplitude and $n_0$ is the average noise amplitude
[75]. For the GFFS algorithm, A is the mean difference in Gabor-filter output between
regions, and $n_0$ corresponds to the local fluctuations in filter output within a region. If
$A/n_0$ is defined as the signal-to-noise ratio (S/N) of the Gabor-filter output m, then the
S/N  seems  to  be  a  reasonable  basis  for  signature  quality. 

A  measure  of  step-signature  quality  based  on  the  S/N  can  be  derived  by  viewing 
step detection as a stochastic signal detection problem, where the goal is to minimize
the probability of erroneously classifying one signal (textured region) as another. Within this
framework,  the  Gabor-filter  output  within  a  given  textured  region  is  considered  to  be 
a  random  variable.  Although  the  distribution  of  this  random  variable  is,  in  general, 
unknown,  it  will  be  assumed  that  its  distribution  can  be  approximated  by  a  Gaussian 
over  the  range  of  probabilities  of  interest.  (Clearly  it  is  not  strictly  Gaussian,  since  the 
Gabor-filter  output  is  never  negative.  Possibly,  the  Rayleigh- Rice  distribution  would  be 
more appropriate, but then the analysis becomes more complex with little impact.)

Given two textured regions $A$ and $B$ and a Gabor filter $G_f$ (3.1), let the output
of $G_f$ be represented by the probability density function $p_A$ when $G_f$ is applied to $A$
and by $p_B$ when $G_f$ is applied to $B$. Consider the following experiment: apply $G_f$ to
a textured region (either $A$ or $B$) and record the output $m$, at some random position
$(x, y)$, in the random variable $z$. The problem is to decide whether the random sample
was taken from region $A$ (hypothesis $H_0$) or from region $B$ (hypothesis $H_1$). Define a
decision point $d$, such that if $z < d$, then the sample is presumed to be from region $A$
(accept hypothesis $H_0$). Otherwise, it is presumed to be from $B$ (accept hypothesis $H_1$).
For this experiment there are two possible errors: accepting $H_1$ when $H_0$ is true (Type
I error), or accepting $H_0$ when $H_1$ is true (Type II error). The goal is to minimize the
sum of these two error probabilities.

The solution to this problem is well known (e.g., see [76]), and reduces to finding
the decision point $d$ such that the likelihood ratio $\Lambda(z) = p_B(z)/p_A(z)$ satisfies

$$\Lambda(d) = \frac{P_0}{1 - P_0} \tag{7.4}$$

where $P_0$ is the prior probability that the region is $A$. If we assume that the two regions
have the same area and are equally likely to occur, then $P_0 = 1 - P_0 = 1/2$, and $\Lambda(d) = 1$.
Thus the problem reduces to finding $d$ such that $p_A(d) = p_B(d)$.

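Let the normal distribution functions $p_A$ and $p_B$ have parameters $(\mu_A, \sigma_A)$ and
$(\mu_B, \sigma_B)$ respectively, and without loss in generality, assume $\mu_A < \mu_B$. Then, equating
$p_A(d)$ to $p_B(d)$ gives

$$\frac{1}{\sigma_A\sqrt{2\pi}} \exp\left[\frac{-(d-\mu_A)^2}{2\sigma_A^2}\right] = \frac{1}{\sigma_B\sqrt{2\pi}} \exp\left[\frac{-(d-\mu_B)^2}{2\sigma_B^2}\right] \tag{7.5}$$

Solving for $d$ produces

$$d = \frac{(\mu_A\sigma_B^2 - \mu_B\sigma_A^2) \pm \sigma_A\sigma_B Z}{\sigma_B^2 - \sigma_A^2} \tag{7.6}$$

where

$$Z = \sqrt{(\mu_B - \mu_A)^2 + 2(\sigma_B^2 - \sigma_A^2)\ln(\sigma_B/\sigma_A)} \tag{7.7}$$

and $d$ is chosen such that $\mu_A \le d \le \mu_B$.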
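
The selection of the decision point in (7.6) and (7.7) can be sketched in a few lines. The following is only an illustrative sketch, not the implementation used in this work: the function names are hypothetical, and the fallback to the midpoint when the two standard deviations are equal is an added assumption (in that case (7.6) reduces to 0/0, while two equal-width Gaussians cross midway between their means).

```python
import math


def gaussian_pdf(z, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    return math.exp(-((z - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))


def decision_point(mu_a, sigma_a, mu_b, sigma_b):
    """Return d with p_A(d) = p_B(d) and mu_a <= d <= mu_b, following (7.6)-(7.7)."""
    if not mu_a < mu_b:
        raise ValueError("expected mu_a < mu_b")
    if math.isclose(sigma_a, sigma_b):
        # Assumption: (7.6) is 0/0 for equal variances, so use the midpoint,
        # where two equal-width Gaussians cross.
        return 0.5 * (mu_a + mu_b)
    var_a, var_b = sigma_a ** 2, sigma_b ** 2
    # Z from (7.7); the term under the root is never negative.
    z = math.sqrt((mu_b - mu_a) ** 2 + 2.0 * (var_b - var_a) * math.log(sigma_b / sigma_a))
    base = mu_a * var_b - mu_b * var_a
    for sign in (1.0, -1.0):
        d = (base + sign * sigma_a * sigma_b * z) / (var_b - var_a)  # (7.6)
        if mu_a <= d <= mu_b:
            return d
    raise ValueError("neither root of (7.6) lies between the two means")


# Made-up parameters for two regions.
d = decision_point(1.0, 1.0, 4.0, 2.0)
print(d, gaussian_pdf(d, 1.0, 1.0), gaussian_pdf(d, 4.0, 2.0))
```

When the two variances differ greatly, neither root of (7.6) need fall between the means; the sketch raises an error in that case rather than guessing.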

The error probabilities can then be computed as

$$P_I = \frac{1}{\sigma_A\sqrt{2\pi}} \int_{d}^{\infty} \exp\left[\frac{-(z-\mu_A)^2}{2\sigma_A^2}\right] dz \tag{7.8}$$

$$P_{II} = \frac{1}{\sigma_B\sqrt{2\pi}} \int_{-\infty}^{d} \exp\left[\frac{-(z-\mu_B)^2}{2\sigma_B^2}\right] dz \tag{7.9}$$

where $P_I$ and $P_{II}$ are the Type I and Type II error probabilities. Then the total error
probability $P_E$ becomes $P_E = P_I + P_{II}$. As $P_E$ becomes small, the probability of
mistaking one region for the other becomes small. Thus $P_E$ is a reasonable indicator
of step-signature quality. Per step 2 of the GFFS algorithm, $P_E$ is computed for each
Gabor-filter center frequency $(U, V)$ of interest. Thus $P_E$ depends on $(U, V)$; i.e.,

$$P_E(U, V) = P_I(U, V) + P_{II}(U, V) \tag{7.10}$$


If the parameters of $p_A$ and $p_B$ are known, it is a simple matter to compute $P_E$.
In practice, however, only estimates of these parameters are available. Estimates of
$(\mu_A, \mu_B)$ can be obtained from the sample means, and estimates of $(\sigma_A^2, \sigma_B^2)$
can be obtained from the sample variances. The sample mean and the sample
variance are both consistent estimators. Thus, as the number of
available samples approaches infinity, the error in estimating $P_E$ using the sample means
and variances approaches zero.
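
As an illustration of (7.8) through (7.10), the Gaussian tail integrals can be evaluated with the complementary error function, and the parameters can be estimated from samples. This is a minimal sketch under stated assumptions: the function names and sample values are made up for the example, and the decision point is taken as the midpoint of the means, which is the crossing point only when the two spreads are equal.

```python
import math
import statistics


def upper_tail(d, mu, sigma):
    """P(z >= d) for z ~ N(mu, sigma^2)."""
    return 0.5 * math.erfc((d - mu) / (sigma * math.sqrt(2.0)))


def lower_tail(d, mu, sigma):
    """P(z < d) for z ~ N(mu, sigma^2)."""
    return 0.5 * math.erfc((mu - d) / (sigma * math.sqrt(2.0)))


def total_error(d, mu_a, sigma_a, mu_b, sigma_b):
    """P_E = P_I + P_II for decision point d, as in (7.8)-(7.10)."""
    p_type1 = upper_tail(d, mu_a, sigma_a)   # region-A sample falls at or above d
    p_type2 = lower_tail(d, mu_b, sigma_b)   # region-B sample falls below d
    return p_type1 + p_type2


def estimate(samples):
    """Sample mean and sample standard deviation of filter outputs from one region."""
    return statistics.mean(samples), statistics.stdev(samples)


# Hypothetical filter outputs at one center frequency (made-up numbers).
region_a = [0.9, 1.1, 1.0, 1.2, 0.8]
region_b = [2.9, 3.1, 3.0, 3.2, 2.8]
mu_a, sigma_a = estimate(region_a)
mu_b, sigma_b = estimate(region_b)
d = 0.5 * (mu_a + mu_b)   # crossing point when the two spreads are equal
print(total_error(d, mu_a, sigma_a, mu_b, sigma_b))
```

Repeating this computation for every center frequency of interest and keeping the frequency with the smallest result corresponds to step 3 of the algorithm.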

7.2.2  Gabor-Filter  Application  via  Windowed  Fourier  Transforms 

Step 1.b of the GFFS algorithm requires the application of multiple Gabor filters
(one  filter  for  each  2-D  frequency  to  be  tested)  to  the  randomly  selected  points  within 
each  of  the  two  textured  regions.  An  efficient  method  for  performing  this  operation  is 
based  on  the  windowed  Fourier  transform  [77]. 

The windowed Fourier transform is similar to the classic Fourier transform except
that the input is premultiplied by a window function. To compute the windowed Fourier
transform $F$, the following equation is evaluated:

$$F_{X,Y}(U, V) = \iint i(x, y)\, w(x - X, y - Y) \exp[-j(Ux + Vy)]\, dx\, dy \tag{7.11}$$

(all integrals range from $-\infty$ to $\infty$ unless otherwise stated). Here, $w$ is the window
function, $i$ is the image to be transformed, and $F$ is a function of frequency $(U, V)$ and
window position $(X, Y)$.

The parallel application of a family of Gabor filters to an image at a point is a
special case of applying a windowed Fourier transform at the point in the image [78, 79].
To show this, let $p$ be the result of convolving an image $i$ with a GEF $h$. Then

$$\begin{aligned}
p(x, y) &= h(x, y) * i(x, y) \\
        &= \iint i(\alpha, \beta)\, h(x - \alpha, y - \beta)\, d\alpha\, d\beta
\end{aligned}$$

Consider one specific point in the convolution $(X, Y)$. Then

$$\begin{aligned}
p(X, Y) &= \iint i(\alpha, \beta)\, h(X - \alpha, Y - \beta)\, d\alpha\, d\beta \\
        &= \iint i(\alpha, \beta)\, g((X - \alpha)', (Y - \beta)') \exp[j(U(X - \alpha) + V(Y - \beta))]\, d\alpha\, d\beta
\end{aligned}$$

where $[(X - \alpha)', (Y - \beta)']$ refer to rotated spatial coordinates as defined in Chapter 3. After
rearranging terms and factoring out the constant complex exponential $K = \exp[j(UX + VY)]$,
we have

$$p(X, Y) = K \iint i(\alpha, \beta)\, g((X - \alpha)', (Y - \beta)') \exp[-j(U\alpha + V\beta)]\, d\alpha\, d\beta \tag{7.12}$$

Defining the window function $w$ in (7.11) as $w(x, y) = g(-x', -y')$ gives

$$p(X, Y) = K \iint i(\alpha, \beta)\, w(\alpha - X, \beta - Y) \exp[-j(U\alpha + V\beta)]\, d\alpha\, d\beta \tag{7.13}$$

Observe that, except for the constant $K$, equations (7.11) and (7.13) are equivalent.
Computing the complex magnitude of (7.13) eliminates $K$ (which represents a constant
phase shift), resulting in

$$|p(X, Y)| = \left| \iint i(\alpha, \beta)\, w(\alpha - X, \beta - Y) \exp[-j(U\alpha + V\beta)]\, d\alpha\, d\beta \right| \tag{7.14}$$


This  justifies  (7.1). 

The  previous  development  was  based  on  continuous  functions.  Thus,  X,  Y,  U,  V 
represent  continuous  variables.  These  arguments  can  be  easily  extended  to  the  discrete 
case,  where  X,  Y,  U,  V  take  on  discrete  values.  In  the  discrete  case,  the  windowed  Fourier 
transform  is  implemented  using  the  DFT.  Then,  {X,Y)  refer  to  image  pixels,  and  {U,V) 
refer  to  the  DFT  frequencies.  Thus,  if  an  image  is  multiplied  by  a  truncated  Gaussian 
centered  at  image  point  (X,Y),  and  the  DFT  magnitude  is  computed,  this  approximates 
the application of a family of Gabor filters to the image at the point $(X, Y)$, where each
filter’s center frequency corresponds to one of the DFT frequencies. Thus, computing
a single DFT is equivalent to determining the output from a family of Gabor filters at
a single point, where the center frequencies of the filters span the frequency domain
of the image. It should be noted that a Gabor filter could be designed with a center
frequency other than one of the DFT frequencies. Thus, GFFS does not apply all possible
Gabor  filters  to  an  image.  Later  in  this  section  (under  “other  issues”),  arguments  will 
be  presented  suggesting  that  these  omissions  are  not  significant. 
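
The discrete form of this equivalence can be checked numerically. The sketch below is an illustration only and makes several assumptions: the image is random test data, the Gaussian is isotropic (so the rotation of the coordinates is immaterial), the window is truncated to a radius of about three standard deviations, and all names are hypothetical. For clarity each DFT term is summed directly; in practice a single FFT of the windowed image would produce all of the terms at once.

```python
import cmath
import math
import random


def gaussian(dx, dy, sigma):
    """Unnormalized isotropic Gaussian envelope."""
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))


def windowed_dft_term(image, X, Y, k, l, sigma, radius):
    """One term of the windowed DFT: window centered at (X, Y), DFT frequency (k, l)."""
    n = len(image)
    total = 0j
    for y in range(Y - radius, Y + radius + 1):
        for x in range(X - radius, X + radius + 1):
            w = gaussian(x - X, y - Y, sigma)
            total += image[y][x] * w * cmath.exp(-2j * math.pi * (k * x + l * y) / n)
    return total


def gabor_output(image, X, Y, k, l, sigma, radius):
    """Convolution of the image with a Gabor function h, evaluated only at (X, Y)."""
    n = len(image)
    total = 0j
    for b in range(Y - radius, Y + radius + 1):
        for a in range(X - radius, X + radius + 1):
            dx, dy = X - a, Y - b
            h = gaussian(dx, dy, sigma) * cmath.exp(2j * math.pi * (k * dx + l * dy) / n)
            total += image[b][a] * h
    return total


random.seed(0)
N = 32
image = [[random.random() for _ in range(N)] for _ in range(N)]
X, Y, sigma = 16, 16, 2.0
radius = int(3 * sigma)   # truncated window, about 6*sigma wide
for k, l in [(0, 0), (3, 5), (7, 2)]:
    wft = windowed_dft_term(image, X, Y, k, l, sigma, radius)
    gab = gabor_output(image, X, Y, k, l, sigma, radius)
    print(k, l, abs(wft), abs(gab))   # the two magnitudes agree
```

The two complex values differ only by the unit-magnitude phase factor corresponding to $K$, so their magnitudes match to within rounding error.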


7.2.3 Specifying $\sigma_x$ and $\sigma_y$

This section examines some heuristics for specifying $\sigma_x$ and $\sigma_y$. It is assumed
that, in most cases, $\sigma_x = \sigma_y = \sigma$; thus, the parameter $\theta$ is immaterial. Consider the
formation of a step signature. As $\sigma$ increases, the S/N increases due to a reduction
in the noise component (per (7.6) and (7.10)). This occurs for two reasons. First, as
window size increases, the computed value of the windowed Fourier transform (WFT)
at a point is determined by a larger neighborhood of image pixels. This causes the
WFT output to be less sensitive to window position perturbations, thus reducing output
variation (i.e., noise). Secondly, for accurate sampling, the possible window positions are
restricted to those that approximately keep the window within the region boundaries (the
term approximately is used since the window is a Gaussian with infinite extent and will
always extend beyond the region bounds). As window size increases, however, the extent
of possible window positions decreases. Since the WFT output now varies slowly with
position, reducing the size of the sampling area further reduces the output variation. In
the limit as the window size approaches the size of the region, the variation in WFT
output goes to zero. This causes the S/N to approach infinity, which suggests that $\sigma$
should be made as large as possible.

The fallacy in this line of thinking is that if $\sigma$ is made arbitrarily large, any measure
of region variability is lost. It must be remembered that in a real texture-segmentation
problem, the region size and boundaries are unknown. In that case, if $\sigma$ is too large,
the window can significantly overlap regions, thus reducing discriminability. Therefore,
the choice of $\sigma$ must be guided by practical considerations. Section 6.1 showed that the
choice of $\sigma$ is a tradeoff between discriminability and boundary localization, and that
for many strongly-ordered textures, a good compromise is to choose $\sigma$ approximately
equal to the texel spacing. As we will soon see, however, certain textures require a filter
configuration that employs a much smaller $\sigma$. In practice the GFFS algorithm described
in Section 7.1 produces similar results over a wide range of $\sigma$’s. Thus, it suffices to run
the algorithm several times using a few widely spaced values of $\sigma$, and compare results.

7.2.4  Modified  Version  of  Gabor  Filter 

Fig. 7.1a is an example of a synthetic texture consisting of arrows and triangles.
Fig. 7.1c shows the output of a Gabor filter with $\sigma_x = \sigma_y = \sigma$, and $\sigma$ equal to
the texel spacing. As can be seen in Fig. 7.1c, the step signature is accompanied by
considerable variation. This effect is typical of strongly-ordered textures whose texels
exhibit pose and/or shape perturbations. The problem is that when $\sigma$ is large, the
bandwidth of the Gabor filter is very narrow, and thus very selective in frequency. In
this case, it is too selective in frequency. It not only discriminates between the two
differently textured regions, but it also detects local frequency variations within a region
(caused by the random orientations and perturbations of the texels). One possibility
is to reduce the size of $\sigma$ so that the Gabor filter will adequately discriminate between
textures without responding to within-texture variations. There is, however, an
undesirable side effect of reducing $\sigma$. As $\sigma$ becomes smaller, the spatial resolution of the
Gabor filter increases. This increase in spatial resolution causes the Gabor filter to
respond to local spatial variations within a texture (e.g., the periodic placement of the
texels). This effect is illustrated in Fig. 7.1d, where $\sigma$ was chosen to be one half of that
in Fig. 7.1c. It will be argued (under “other issues”) that this texture is wide-sense
periodic [80]. If this is true and the period corresponds to the texel spacing, then the first
two moments of the texture’s graylevel distribution are periodic (i.e., $\mu(r) = \mu(r + T)$
and $K(r_1, r_2) = K(r_1 + T, r_2) = K(r_1, r_2 + T)$, where $\mu$ is the mean, $K$ is the
autocovariance matrix, and the $r$’s are 2-D position vectors. $T$ is any of three constant 2-D
vectors $(x_0, 0)^T$, $(0, y_0)^T$, $(x_0, y_0)^T$, where $x_0$ and $y_0$ represent the texel periods in $x$ and
$y$). Thus it is not unreasonable to expect that the local spatial-frequency composition
and, hence, the Gabor-filter output will also be approximately periodic (in a stochastic
sense). Thus the local spatial average of the Gabor-filter output within a textured region
should be approximately constant. This spatial average can be computed by applying a
Gaussian to the Gabor-filter output as shown below.

$$m'(x, y) = m(x, y) * g'(x, y) \tag{7.15}$$

where $g'(x, y)$ is a Gaussian similar to (3.3). The result of applying this Gaussian is
shown in Fig. 7.1e. Note the improvement in signature quality.

Fig. 7.1. Contrasting outputs from C1 and C2 configuration filters:
(a) A strongly-ordered synthetic texture pair consisting of arrows and triangles.
(b) Canny edge detector applied to (e) and superimposed on (a).
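
The effect of the smoothing stage in (7.15) can be illustrated with a 1-D stand-in for the Gabor-filter output. The sketch below is hypothetical: the rippled signal, the choice of a smoothing width equal to the ripple period, and the function names are assumptions made only for the example, not values taken from the experiments.

```python
import math


def gaussian_kernel(sigma, radius):
    """Normalized, truncated 1-D Gaussian weights."""
    weights = [math.exp(-(t * t) / (2.0 * sigma * sigma)) for t in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, sigma):
    """Convolve with a Gaussian, keeping only positions the kernel fully covers."""
    radius = int(3 * sigma)
    kernel = gaussian_kernel(sigma, radius)
    out = []
    for i in range(radius, len(signal) - radius):
        out.append(sum(kernel[j] * signal[i - radius + j] for j in range(len(kernel))))
    return out


def spread(values):
    """Peak-to-peak variation."""
    return max(values) - min(values)


# Stand-in for a first-stage output m that ripples with the texel period.
period = 24
m = [1.0 + 0.5 * math.cos(2.0 * math.pi * x / period) for x in range(480)]
m_smooth = smooth(m, sigma=period)
print(spread(m), spread(m_smooth))   # the ripple is strongly attenuated
```

The local average stays near the mean level of the rippled signal while the periodic variation nearly vanishes, which is the behavior the second stage relies on.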

Although the GFFS algorithm cannot directly predict the quality of the step
signature in Fig. 7.1e due to the two-stage process, GFFS can still be employed successfully.
By using small values of $\sigma$, we can still determine the best step signature for the first
stage. Since the second stage is simply a smoothing operation, the resulting output
should still represent the highest quality step signature (for that particular value of $\sigma$).
The application of a single Gabor filter will be referred to as a C1 configuration and the
two-stage configuration (Gabor filter (3.1) followed by a Gaussian smoothing (7.15)) will
be called a C2 configuration.

In most cases $\sigma_x = \sigma_y = \sigma$ is a reasonable choice. Chapter 5 showed, however,
that for strongly-ordered textures, when the texel lattice is not square, an asymmetric
filter is preferable. In that case, the ratio of $\sigma_x$ to $\sigma_y$ should be adjusted to match the
aspect ratio of the texel-spacing lattice, and $\theta$ should be chosen to match the orientation
of the lattice.

7.2.5  Other  Issues 

This section presents other implementational issues for the GFFS algorithm.
Although theoretically the Gaussian window used to compute the windowed Fourier
transform has infinite spatial extent, in practice it is truncated to some finite window size
$W$ (typically $6\sigma$). The number of frequency terms computed by the windowed Fourier
transform is determined by the window size $W$. For consistency it is desirable to
compute the same number of frequency components for all window sizes. To achieve this,
the windowed data are zero padded to the size of the full image. Although
zero padding does not improve frequency resolution (resolution depends on the size of
the window), it does increase the number of frequency components generated. The effect
is to provide interpolated frequency terms [77]. A positive side effect of this increase is
that the number of Gabor-filter center frequencies $(U, V)$ that are tested is increased.
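
The interpolating effect of zero padding can be seen in a small 1-D example. This is a sketch with made-up data and a directly summed DFT (the names and sizes are assumptions for illustration): padding an 8-sample windowed signal to 32 samples yields 32 frequency terms, of which every fourth one coincides with a term of the unpadded DFT.

```python
import cmath
import math


def dft(samples):
    """Directly summed discrete Fourier transform."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


# Made-up windowed data: a Gaussian-weighted cosine, 8 samples long.
window = [math.exp(-((t - 3.5) ** 2) / 8.0) * math.cos(1.3 * t) for t in range(8)]
padded = window + [0.0] * 24   # zero pad from 8 to 32 samples

coarse = dft(window)   # 8 frequency terms
fine = dft(padded)     # 32 frequency terms

# Every 4th term of the padded DFT equals a term of the unpadded DFT; the
# terms in between are interpolated samples of the same underlying spectrum.
for k in range(8):
    print(k, abs(coarse[k]), abs(fine[4 * k]))
```

No new spectral detail appears in the added terms; the spectrum is merely sampled on a finer grid, which is why more candidate center frequencies are tested without any gain in resolution.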

As mentioned earlier, the GFFS algorithm does not compare all possible center
frequencies (a formidable task). There are two reasons why this is not necessary. First,
Appendix B shows that for strongly-ordered textures, the greatest difference in step
height occurs when the Gabor filter is tuned to a multiple of the reciprocal of the texel
spacing (i.e., the frequency of occurrence of the texels). These frequencies, among others,
are examined by this method. Secondly, experience has shown that $P_E(U, V)$ (and thus
signature quality) degrades gracefully with changes in Gabor-filter center frequency. By
comparing $P_E(U, V)$ at frequencies adjacent to the “best” frequency, it can be verified
that $P_E(U, V)$ degrades gracefully in each case.

The  random  selection  of  sample  points,  required  in  step  1  of  the  GFFS  algorithm, 
will now be discussed. In many cases, texture can be modeled as a random process.
When we are given a sample of a textured region, we are sampling only one instance of
this process. In general, this is insufficient for estimating the statistics of the process.
In effect, we are finding the “best” filter for one particular instance of the texture. The
same filter might be totally ineffective for some other instance. Often, the underlying
process is wide-sense periodic [80]. Then the process statistics are unique only within
a fixed period (e.g., between two adjacent texels). Since a given texture instance will
typically contain many periods, sampling a single instance can provide a representative
sample. Although not all textures can be considered wide-sense periodic, it is probably a
reasonable assumption for strongly-ordered (e.g., Figs. 2.6a and 7.1a) and many disordered
textures (e.g., Fig. 2.6c).

The number of points required for a representative sample depends on the
variability within a texture. For all examples used in this study, stable results were achieved
by taking 200 samples from each region. In general, the number of required samples
will be proportional to the region size (in this case, ≈ 0.1% of the region size). Since
the region size is $O(N^2)$ and the time complexity of the DFT is $O(N^2 \log N)$, the total
time complexity of the algorithm is $O(N^4 \log N)$ (where $N$ is the row or column
dimension of the region). In practice, run times are in the neighborhood of 2 hours on a Sun

7.3 Previously Proposed Methods for Designing Gabor Filters

Several  other  techniques  for  determining  Gabor-filter  parameters  have  previously 
been  suggested.  One  popular  technique  is  to  use  heuristics  based  on  neurophysiological 
and  psychophysical  studies  of  the  human  visual  system  (HVS)  to  design  a  set  of  filters 
[56, 60, 74]. (Note that Malik and Perona [56] did not use Gabor filters. They did,
however, use functions that are similar.) While this technique has been used effectively
to  test  prototype  texture-segmentation  schemes,  it  is  a  brute  force  approach  providing 
little  insight  into  the  relationship  between  algorithm  output  and  the  filter  characteristics 
that  produced  the  output.  Thus,  it  is  difficult  to  predict  how  these  schemes  will  perform 
over  a  wide  range  of  textures. 

A  method  suggested  by  the  works  of  Krose  [15]  and  Fogel  and  Sagi  [55]  involves 
comparing  the  spectral  composition  of  prototype  texels  from  the  regions  to  be  segmented. 
For uniform textures, Chapter 5 showed that the formation of a step signature is directly
related to the difference in frequency content between texels. This method, however, has
two limitations. First, it is restricted to strongly-ordered textures. Second, as texel
spacing  decreases,  the  texels  begin  to  interact  and  lose  their  individual  identity.  When 
this  occurs,  the  method  becomes  ineffective. 

Before  the  GFFS  algorithm,  the  most  effective  technique  for  determining  Gabor- 
filter  parameters  was  based  on  computing  the  DFT  of  each  textured  region  [53,  68]. 
The  2-D  frequency  component  that  differs  most  between  regions  is  then  selected.  This 
method  will  be  referred  to  as  the  DFT  method.  The  DFT  method  is  equivalent  to 
applying a windowed Fourier transform to a region, where the window is rectangular
and equal to the size of the region. Experimental results indicate that the best choice
of center frequency is typically insensitive to window size. Since a rectangular window
can be sized to approximate the spatial extent of a Gaussian, the results of the DFT
method sometimes predict the same “best” center frequency as GFFS. Although the
DFT method is somewhat faster than GFFS, it does not always predict useful center
frequencies. Chapter 8 compares the GFFS algorithm with the DFT method for finding
filter  frequency  parameters  and  describes  other  limitations  of  the  DFT  method. 

7.4  Determining  Filter  Parameters  for  Other  Signature  Types 

The  GFFS  algorithm  presented  in  Section  7.1  assumes  that  to  distinguish  between 
two  textured  regions,  a  step  signature  is  desired.  The  algorithm  can  be  easily  modified 
to  find  Gabor  filters  that  generate  valley/ridge  and  difference-in-variance  signatures.  For 
the valley/ridge, samples are pooled from both textured regions A and B (for this type
of signature, regions A and B are typically identical); this forms sample set A. Samples
are  then  collected  along  the  texture  boundary  to  give  sample  set  B.  The  rest  of  the 
algorithm  does  not  change.  With  this  modification,  the  algorithm  will  find  the  center 
frequency  that  produces  the  highest  ridge  or  deepest  valley. 

To  determine  Gabor  filters  that  will  produce  the  largest  difference  in  output  vari¬ 
ance,  simply  compare  the  differences  in  sample  variances  and  choose  the  DFT  frequency 
that  generates  the  largest  difference.  It  is  also  necessary  to  check  that  the  means  are 
similar,  as  this  is  an  important  consideration  for  subsequently  transforming  the  differ¬ 
ence  in  variance  to  a  difference  in  mean  (Section  ■S.2.3).  The  question  of  which  type  of 
signature  is  most  appropriate  for  distinguishing  a  given  texture  pair  remains  open. 
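
The variance-based selection just described can be sketched as follows. The sketch is an assumption-laden illustration: the text only requires that the means be "similar", so the threshold `max_mean_gap`, the function name, and the sample values are all hypothetical.

```python
import statistics


def best_variance_frequency(outputs_a, outputs_b, max_mean_gap):
    """Pick the frequency with the largest difference in sample variance.

    outputs_a and outputs_b map a center frequency (U, V) to the list of
    filter-output samples from regions A and B. Frequencies whose sample
    means differ by more than max_mean_gap (a hypothetical threshold) are
    skipped.
    """
    best, best_gap = None, -1.0
    for freq in outputs_a:
        a, b = outputs_a[freq], outputs_b[freq]
        if abs(statistics.mean(a) - statistics.mean(b)) > max_mean_gap:
            continue
        gap = abs(statistics.variance(a) - statistics.variance(b))
        if gap > best_gap:
            best, best_gap = freq, gap
    return best, best_gap


# Made-up samples at three candidate frequencies.
outputs_a = {(1, 0): [1.0, 1.1, 0.9, 1.0],
             (0, 1): [1.0, 1.0, 1.0, 1.0],
             (2, 2): [1.0, 1.2, 0.8, 1.0]}
outputs_b = {(1, 0): [0.2, 1.8, 0.1, 1.9],
             (0, 1): [1.0, 9.0, 1.0, 9.0],
             (2, 2): [1.0, 1.3, 0.7, 1.0]}
print(best_variance_frequency(outputs_a, outputs_b, max_mean_gap=0.2))
```

In this made-up data the frequency (0, 1) has the largest variance difference but is rejected because its means differ, so (1, 0) is selected.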


Chapter  8 


Results 


This  chapter  presents  1-D  and  2-D  experimental  results  corroborating  the  analysis 
done  in  previous  chapters. 

8.1  1-D  Results 

Fig. 8.1 gives examples of the 1-D textures used in the analytical work of Chapter 4,
and Fig. 8.2 gives a plot of four filter outputs produced by applying a Gabor filter
to 1-D textured images.

Each image consists of two regions, 1 and 2, each consisting of six texels. Region 1 is
to the left of zero and region 2 is to the right. The texels in region 1 are defined by (4.1)
and the texels in region 2 are defined by (4.2). All texels are spaced 24 units apart and
are 16 units wide (i.e., Ax = 16). The texture frequencies for regions 1 and 2 are $\omega_1$ and
$\omega_2$ respectively, and the phase difference $(\phi_1 - \phi_2)$ between regions is $\phi$. By adjusting
these parameters, discontinuities in frequency, phase, or both can be induced between
regions.

Curve A is the result of a difference in texture frequency between regions. The
Gabor filter used to produce this curve has a center frequency that matches the texture
frequency of region 1. Thus, the Gabor-filter output is greater for region 1 than for region
2. As the figure shows, the filter output approximates a step. Note that the position of
the texture boundary ($x = 0$) is located approximately at the middle of the step. Curve
B is the result of a difference in phase. Note that the texture frequency is the same for
both textures, but there is a $\pi$ radian phase difference between them. Also, note that the
texture frequencies equal the center frequency of the filter. Observe that the curve forms
a valley as expected from the analytical work. In this case, the position of the boundary
occurs at the global minimum of the Gabor-filter output. Curve C also represents a
difference in phase, but a ridge is formed in the output rather than a valley. The 1-D
analysis, however, does not reveal how a ridge can occur at a phase discontinuity. Note
that the texture boundary occurs at the global maximum (minimum) of the Gabor-filter
output for curve C (B). Curve D is produced by a combination of frequency and phase
changes. In this case the output is in between a step and a valley. Depending on the
frequency and phase values, either the step or the ridge/valley will dominate for such
cases. For curve D, the texture boundary is neither at the minimum nor at the middle of
the step. Thus, there exists an inherent degree of boundary uncertainty with this profile,
as is the case with certain perceived texture boundaries. For all filters used in Fig. 8.2,
$\sigma$ equals the texel spacing.

Fig. 8.1. Examples of 1-D textures conforming to the 1-D texture model of
Section 4.1.
(a) 1-D texture constructed from the model (4.3), where the texture frequencies
differ between regions (the frequency differences have been exaggerated for
clarity).
(b) 1-D texture constructed from the model (4.3), where the texture elements
differ in phase between regions (the frequency has been reduced for clarity).

Fig. 8.2. 1-D Gabor-filter outputs $m$ derived from 1-D textures (defined in (4.1)
and (4.2)). $\omega_1$ and $\omega_2$ are the texture frequencies in radians per unit distance
for regions 1 and 2 respectively, and $\phi$ is the phase difference between regions
in radians. $\omega_c$ is the filter center frequency in radians per unit distance. The
location $x = 0$ represents the texture boundary.
Curve A: $\omega_1 = 10.47$, $\omega_2 = 9.43$, $\omega_c = 10.47$, $\phi = 0$ (differing texture frequency).
Curve B: $\omega_1 = 10.47$, $\omega_2 = 10.47$, $\omega_c = 10.47$, $\phi = \pi$ (textures of same texture
frequency, but out of phase).
Curve C: $\omega_1 = 5.03$, $\omega_2 = 5.03$, $\omega_c = 5.03$, $\phi = 1.57$ (textures of same texture
frequency, but out of phase).
Curve D: $\omega_1 = 10.24$, $\omega_2 = 10.47$, $\omega_c = 10.47$, $\phi = 1.5$ (differences in texture
frequency and phase).

8.2  2-D  Results 

All  images  used  in  the  examples  to  follow  consist  of  two  regions.  Except  for 
the  examples  of  natural  textures,  each  region  is  composed  of  a  collection  of  synthesized 
texels. Each texel is formed from line segments 20 pixels long by 2 pixels wide. The
average intensity difference between regions is minimized by using approximately the
same number of pixels in each texel. The size of the images is 512 x 512 pixels.

Except where noted, the Gabor-filter parameters were determined as follows.
Section 6.1 recommended that $\sigma_x$ = Ax and $\sigma_y$ = Ay. For most of the examples, Ax = Ay.
Hence, $\sigma_x = \sigma_y = \sigma = 24$ pixels. The aspect ratio A = 1. The GFFS algorithm
developed in Chapter 7 was used to determine the center frequency $(U, V)$. This algorithm
finds the harmonic (U = 2πk/Ax, V = 2πl/Ay) that produces the largest step (or the
deepest valley, or the highest ridge for texture-phase differences). In the figure captions,
the Gabor-filter center frequencies $(U, V)$ are reported in polar coordinates $(F, \phi)$ so that
the orientation of the filter’s sinusoid is explicit ($F = \Omega/2\pi$ in (3.6)).
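
The conversion from a harmonic center frequency to polar form can be sketched as below. Since (3.6) is not reproduced here, the sketch assumes that the radial frequency is the magnitude of $(U, V)$ divided by $2\pi$ and that the orientation is the angle of $(U, V)$, reported in degrees; these conventions and the function names are assumptions for illustration.

```python
import math


def harmonic(k, l, spacing_x, spacing_y):
    """Center frequency (U, V) in radians per pixel for harmonic indices (k, l)."""
    return 2.0 * math.pi * k / spacing_x, 2.0 * math.pi * l / spacing_y


def to_polar(u, v):
    """Radial frequency in cycles per pixel and orientation in degrees (assumed conventions)."""
    radial = math.hypot(u, v) / (2.0 * math.pi)
    angle = math.degrees(math.atan2(v, u))
    return radial, angle


# First diagonal harmonic for a texel spacing of 24 pixels in both directions.
u, v = harmonic(1, 1, 24.0, 24.0)
print(to_polar(u, v))
```

For a spacing of 24 pixels, the first horizontal harmonic corresponds to one cycle per 24 pixels at an orientation of zero under these conventions.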

The input images are defined digitally. Thus, aliasing in the images is not an issue.
Aliasing is an issue, however, for the GEF, since it must be sampled before applying it to
the image. Since the GEFs are not bandlimited, some aliasing will occur regardless of the
sample rate. Bovik et al. derived the required sample rate for various percentages of alias
energy [53]. In the examples, the GEFs are sampled so that the energy due to aliasing is,
in most cases, < 1%.¹ Exceptions occur for the filter outputs shown in Figs. 8.11c and
8.12d, where the aliasing energy is 7.4% and 12% respectively. The increase in aliasing is
due to the high center frequencies used in these filters, but this does not pose a problem.
The GFFS algorithm still finds the most discriminating digitized filter, even though it
might be somewhat distorted.

Since the GEFs are not spatially limited, truncation is necessary. The GEFs
are truncated to a width of $6\sigma$, which represents an error of about 0.2%. Except in
Fig. 8.13, all points within 1/2 the filter width from the boundary are discarded in the
final output to eliminate the wraparound error that arises in discrete convolution. For
this reason some of the output figures appear truncated. Some examples show the results
of applying a Canny edge detector to a filtered image. This gives possible subsequent
segmentations. No effort was made, however, to optimize this detector or to optimize
the segmentation algorithm applied to the filtered image. Appendix D discusses the
implementation details of filter application and edge detection.

¹ Bovik et al.’s calculation is conservative due to using ln 2 instead of √(ln 2) in their equation for 7b
[53].

8.2.1  Difference  in  Texel  Type 

Fig.  8.3a  illustrates  a  uniformly  textured  image  consisting  of  +s  and  Ls.  Fig.  8.3b 
gives  a  plot  of  a  Gabor-filter  output  m(x,  y)  versus  x  and  y.  The  vertical  axis  gives 
m(x, y) (the maximum and minimum filter outputs are indicated on the axis), and the
two axes approximately horizontal and into the page represent x and y. All Gabor-filter
outputs  are  depicted  this  way. 

The  shape  of  the  profile  is  predominantly  a  step  function  with  some  undershoot 
present.  The  output  of  a  Canny  edge  detector  [75]  applied  to  the  Gabor-filtered  image 
is  shown  superimposed  on  the  original  image  -  see  Fig.  8.3c.  As  Figs.  8.3b  and  8.3c 
indicate,  the  boundary  between  the  two  textured  regions  is  well  localized. 

An estimate of step height in Fig. 8.3b can be found by using the equations for A1
(5.22) and A2 (5.23). These equations imply that the ratio |T1|/|T2| is a relative measure
of the step height. Letting T1 correspond to a “+” and T2 correspond to an “L” for the
image in Fig. 8.3a gives T1 = 22.38 and T2 = -1. Thus, the predicted step height is
|T1|/|T2| = |22.38|/|-1| = 22.38. For Fig. 8.3b, the ratio of left-region and right-region
heights is 22.70, which is in good agreement with the predicted value.

Table 8.1. Comparison of actual and predicted Gabor-filter output values for
the step signature in Fig. 8.3b.

location          actual   predicted
+ region          20.0     22.4
max. undershoot   0.010    0.024
L region          0.88     1.00
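As a quick arithmetic check of the step-height comparison, the following sketch recomputes the predicted ratio from the quoted texel coefficients and compares it with the measured ratio. The function name is hypothetical; the numbers are the ones quoted in the text and in Table 8.1.

```python
def step_height_ratio(t1, t2):
    """Relative step height predicted from two texel coefficients."""
    return abs(t1) / abs(t2)

predicted = step_height_ratio(22.38, -1.0)   # values quoted for the "+" and "L" texels
measured = 22.70                             # ratio read from Fig. 8.3b
relative_difference = abs(measured - predicted) / predicted

# The "actual" column of Table 8.1 gives nearly the same ratio.
table_ratio = 20.0 / 0.88
```

The measured and predicted ratios differ by under 2%, consistent with the "good agreement" noted in the text.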

In this example, undershoot occurs in the Gabor-filter output because T2 is negative.
This phenomenon is discussed in Section 5.2.1 and in Appendix B. Although (5.21)
demonstrates the possibility of undershoot, (5.18) must be evaluated to determine the
position and extent of undershoot. Letting T1 = 22.38 and T2 = -1 in (5.18), an
estimate of the signature in Fig. 8.3b was computed. Table 8.1 compares actual and
predicted signature values at selected positions. (The actual values have been scaled by
a constant factor for comparison.) This example illustrates that even if the Gabor-filter
center frequency equals an harmonic, undershoot can occur; however, overshoot cannot
occur for this situation, as discussed in Appendix B.

Both overshoot and undershoot can occur if the Gabor filter is not tuned to an
harmonic. Fig. 8.3d gives an example. This figure represents the output of a Gabor
filter having center frequency (U = 0.0283, V = -0.0283) (input is Fig. 8.3a). The
texel spacing is 24 pixels. Thus, the closest harmonic to the filter center frequency is
at 1/Δx = 1/24 = 0.0417. The center frequency then is displaced δU = δV =
0.0134 cycles/pixel away from the nearest harmonic. Using these parameters in (B.1)
to compute predicted values of Gabor-filter output m(x, ·) at selected x reveals good
agreement with the actual values of Fig. 8.3d (see Table 8.2).

Table 8.2. Comparison of actual and predicted Gabor-filter output values for
the step signature in Fig. 8.3d.

location          actual   predicted
+ region          33.5     38.2
max. overshoot    88.7     85.1
max. undershoot   0.11     0.80
L region          1.35     1.71
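The displacement of a filter's center frequency from the nearest texture harmonic can be computed directly. The sketch below assumes harmonics at integer multiples of 1/spacing along each axis, as the text describes; the function name is hypothetical.

```python
def nearest_harmonic(center, spacing):
    """Return (index, harmonic, displacement) for one frequency axis.

    Assumes harmonics lie at index / spacing, in cycles per pixel.
    """
    index = round(center * spacing)
    harmonic = index / spacing
    return index, harmonic, abs(center - harmonic)

# Center frequency (U, V) = (0.0283, -0.0283) with a 24-pixel texel spacing.
k, harmonic_u, delta_u = nearest_harmonic(0.0283, 24)
l, harmonic_v, delta_v = nearest_harmonic(-0.0283, 24)
```

For these inputs the nearest harmonic has magnitude 1/24 ≈ 0.0417 on both axes, and the displacement is about 0.0134 cycles/pixel, matching the figures in the text.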

8.2.2  Difference  in  Texel  Orientation 

Fig.  8.4a  shows  a  uniformly  textured  image  consisting  of  texels  that  differ  in 
orientation. Figs. 8.4b and 8.4c show the outputs of two Gabor filters that use the same
center frequency but use different values for σ. In Fig. 8.4b, σ equals the texel spacing,
and a smooth step signature is achieved. The region on the right produces the greatest
Gabor-filter output m, because the orientation of the Gabor-filter sinusoid matches the
texel orientations on the right. In Fig. 8.4c, σ = 8, which is 1/3 of the texel spacing. For
this small σ, the GEF doesn’t cover multiple texels as it moves across the image, resulting
in ripple in the filter output. The resulting signature, though, exhibits a sharper step
transition than Fig. 8.4b.
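The tradeoff between smoothness and ripple can be illustrated with a 1-D stand-in for the experiment. The sketch below is not the thesis's 2-D procedure: it assumes an impulse train with 24-pixel spacing as the "texture", a Gabor filter tuned to the first harmonic, truncation at ±3σ, and a 2π (cycles-per-pixel) convention in the complex exponential. All names are hypothetical.

```python
import numpy as np

def gabor_1d(sigma, freq):
    """1-D Gabor elementary function truncated to a width of about 6 sigma."""
    half = int(3 * sigma)
    x = np.arange(-half, half + 1)
    g = np.exp(-0.5 * (x / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)
    return g * np.exp(2j * np.pi * freq * x)

def relative_ripple(sigma, spacing=24, n=960):
    """Peak-to-peak variation of the output magnitude, relative to its mean."""
    signal = np.zeros(n)
    signal[::spacing] = 1.0                      # impulse-train "texture"
    h = gabor_1d(sigma, 1.0 / spacing)           # tuned to the first harmonic
    m = np.abs(np.convolve(signal, h, mode="same"))
    half = len(h) // 2
    interior = m[half + spacing : n - half - spacing]   # skip border effects
    return (interior.max() - interior.min()) / interior.mean()

ripple_wide = relative_ripple(sigma=24)    # sigma equal to the texel spacing
ripple_narrow = relative_ripple(sigma=8)   # sigma equal to 1/3 of the spacing
```

With σ equal to the spacing, several impulses always lie under the Gaussian and the output is nearly flat; with σ = 8 the output varies strongly between texel positions, which mirrors the ripple seen in Fig. 8.4c.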

8.2.3  Differences  in  Horizontal  and  Vertical  Texel  Spacing 

Fig. 8.5a is equivalent to Fig. 8.3a, with the exception that Δy = 2Δx for the
two textures. Fig. 8.5b shows a corresponding Gabor-filter output when its GEF uses
a symmetrical Gaussian (i.e., σx = σy, or λ = 1). Note the occurrence of significant
ripple in the y direction. Fig. 8.5c gives the Gabor-filter output for a filter with λ = 2
(σy = 2σx). For this case, the output is “smooth” in both x and y and resembles the
smooth step of Fig. 8.3b.
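A minimal sketch of how an asymmetric GEF kernel with aspect ratio λ = σy/σx might be constructed follows. It assumes an unrotated Gaussian, truncation at ±3σ along each axis, and a 2π (cycles-per-pixel) convention in the exponential; the function name and these conventions are illustrative assumptions rather than the thesis's definition in (3.2).

```python
import numpy as np

def make_gef(sigma_x, lam, U, V):
    """Complex 2-D Gabor elementary function with sigma_y = lam * sigma_x.

    Assumptions: no rotation, kernel truncated at +/- 3 sigma per axis,
    (U, V) given in cycles per pixel.
    """
    sigma_y = lam * sigma_x
    half_x, half_y = int(3 * sigma_x), int(3 * sigma_y)
    y, x = np.mgrid[-half_y:half_y + 1, -half_x:half_x + 1]
    g = np.exp(-0.5 * ((x / sigma_x) ** 2 + (y / sigma_y) ** 2))
    g /= 2 * np.pi * sigma_x * sigma_y
    return g * np.exp(2j * np.pi * (U * x + V * y))

# lam = 2 doubles the Gaussian extent in y, matching a texture with dy = 2 dx.
h = make_gef(sigma_x=24, lam=2, U=1 / 24, V=0.0)
```

The kernel is twice as tall as it is wide, so it spans a comparable number of texels in each direction when the vertical spacing is twice the horizontal spacing.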

8.2.4 Texture-Phase Differences

Fig.  8.6a  consists  of  two  identically  textured  regions,  but  the  regions  are  shifted 
relative  to  each  other  in  both  the  x  and  y  directions  (Fig.  1.3b  gives  a  simpler  example). 
The  texel  spacing  is  24  pixels  in  both  x  and  y,  and  the  region  on  the  right  is  shifted  -8 
pixels  in  the  y  direction  and  -4  pixels  in  the  x  direction  relative  to  the  region  on  the  left. 
Fig.  8.6b  shows  the  output  of  a  Gabor  filter  that  is  tuned  to  a  frequency  that  is  close  to 
an  harmonic.  Note  that  a  valley  signature  results. 

The harmonics are located at (0.042k, 0.042l) and the center frequency of the filter
is (U = 0.0039, V = -0.0410), so the indices of the nearest corresponding harmonic are
k = 0 and l = -1. The filter’s specified center frequency (U, V) differs from one of the
harmonics by an amount (δU = 0.0039, δV = 0.001). From the analysis of Section 5.2.2,
a ridge-signature output is expected. Using the method developed in Appendix C for
computing z, however, reveals that z = 0.068. Thus, even though δU is nonzero, a valley
is predicted rather than a ridge. This occurs because

$$\psi = 2\pi\left[k\,\delta x/\Delta x + l\,\delta y/\Delta y\right] = 2\pi(-1)(-8/24)\ \text{radians} = 120^\circ$$

which is significantly different from ψmax in Table 5.1. The predicted depth of the valley
then is 0.068 times the Gabor-filter output at points far removed from the transition.
The maximum and minimum outputs of Fig. 8.6b imply that the valley is in fact 0.124
times the value at remote points. This error can be explained in part by the sensitivity of
(5.42) to ψ. Recall that ψ depends on the texture-phase shift in the image. Reducing the
phase shift in the y direction by just 1/2 pixel changes the predicted value to z = 0.141.

Again using Fig. 8.6a, a Gabor filter was applied whose center frequency exactly
matched the first harmonic in u (output not shown but similar to Fig. 8.6b). For this
case, the expression for the relative depth of the valley is derived from (5.29), which is
much less sensitive to ψ. Using (5.29) results in z = 0.5, which differs by only 7% from
the actual valley depth.

Fig. 8.6c shows the result of processing Fig. 8.6a using a Gabor filter that is tuned
to a frequency significantly displaced from an harmonic. A ridge signature results. The
harmonics are located at (0.042k, 0.042l) and the center frequency of the filter is
(U = 0.012, V = -0.0410), so k = 0 and l = 1. σx and σy again equal 24. From (5.26)

$$\psi = 2\pi\left[k\,\delta x/\Delta x + l\,\delta y/\Delta y\right] = 2\pi(1)(-8/24)\ \text{radians} = -120^\circ$$

and the method developed in Appendix C predicts a ridge height z = 2.973. The actual
ridge height in Fig. 8.6c is 3.24, a 9% error.
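The texture-phase values quoted in this section follow from the harmonic indices and the region shift. A minimal sketch of that arithmetic is given below; the function name is hypothetical, and the shift of (-4, -8) pixels and the 24-pixel spacing are the values stated in the text.

```python
import math

def texture_phase_deg(k, l, dx, dy, spacing_x, spacing_y):
    """Texture phase 2*pi*(k*dx/spacing_x + l*dy/spacing_y), in degrees."""
    psi = 2.0 * math.pi * (k * dx / spacing_x + l * dy / spacing_y)
    return math.degrees(psi)

# Valley example: k = 0, l = -1, right region shifted (-4, -8) pixels.
psi_valley = texture_phase_deg(0, -1, -4, -8, 24, 24)

# Ridge example: k = 0, l = 1, same shift.
psi_ridge = texture_phase_deg(0, 1, -4, -8, 24, 24)

# Relative error of the predicted ridge height (2.973) against the measured 3.24.
ridge_error = (3.24 - 2.973) / 2.973
```

Because k = 0 in both cases, the 4-pixel shift in x does not contribute; only the 8-pixel shift in y sets the phase.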

8.2.5  Texel-Spacing  Difference  between  Regions 

Fig.  8.7a  shows  a  uniformly  textured  image  similar  to  Fig.  8.3a,  except  that  the 
texel spacing differs between the two regions. Fig. 8.7b shows a corresponding
Gabor-filter output, where the filter is tuned to an harmonic corresponding to the region of
+s. Although the signature is predominantly a step, some undershoot is present near
the texture boundary. Fig. 8.7c shows a similar filter output, but now the filter is tuned
to an harmonic for the region of Ls. In this case, both overshoot and undershoot are
present. Observe that in each case (Figs. 8.7b and 8.7c) the region producing the
greatest filter response corresponds to the one whose harmonic matches the filter center
frequency. The analysis of Appendix B verifies this empirical result.
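The observation that the matched region responds most strongly can be reproduced with a simplified 1-D experiment. The sketch below assumes impulse trains with spacings of 24 and 16 pixels as stand-ins for the two regions, a filter tuned to 1/24 cycles/pixel with σ = 24, and truncation at ±3σ; it is an illustration, not the thesis's 2-D setup, and all names are hypothetical.

```python
import numpy as np

def gabor_magnitude(signal, sigma, freq):
    """Magnitude of a 1-D Gabor-filtered signal (kernel truncated at 3 sigma)."""
    half = int(3 * sigma)
    k = np.arange(-half, half + 1)
    h = np.exp(-0.5 * (k / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)
    h = h * np.exp(2j * np.pi * freq * k)
    return np.abs(np.convolve(signal, h, mode="same"))

n = 960
signal = np.zeros(n)
signal[0:480:24] = 1.0    # left region: texel spacing 24
signal[480::16] = 1.0     # right region: texel spacing 16

m = gabor_magnitude(signal, sigma=24, freq=1 / 24)

# Average response well inside each region, away from the boundary and borders.
left_response = m[120:360].mean()
right_response = m[600:840].mean()
```

The filter's center frequency coincides with a harmonic of the left region only, so the left response is far larger than the right response, giving a step at the boundary.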

8.2.6  Nonuniform  Textures 

Fig. 8.8a depicts a nonuniformly textured image produced by introducing random
orientations and positional perturbations into the texels (+s and Ls) of Fig. 8.3a.
Fig. 8.8b shows a filter output. The random effects cause large fluctuations in the output.
Fig. 8.8c shows the result of applying a Canny edge detector to Fig. 8.8b. Because of
the fluctuations, the detected boundary does not perfectly match the “actual” boundary.
The predicted boundary is, for the most part, correct to within ±1/2 texel. For typical
nonuniform textures (where the actual texture boundary is not well defined), such
fluctuation in the computed texture boundary is expected.

Section  7.2  presented  a  synthetic  texture  pair  consisting  of  triangles  and  arrows, 
and  a  corresponding  Gabor-filter  output  signature  exhibiting  a  step  signature  (Fig.  7.1). 
This  image  can  also  produce  a  signature  exhibiting  a  step  change  in  average  local  output 
variation.  Fig.  8.9a  shows  a  Gabor  filter  output  exhibiting  such  a  signature  (the  input 
image  is  Fig.  7.1a).  After  applying  (5.43),  the  change  in  average  local  output  variation 
was transformed to the step signature shown in Fig. 8.9b.

(c) Canny edge-detector output superimposed on input image.

As previous examples illustrate, a single-stage filter (C1 configuration; see Section 7.2)
often produces adequate results; however, this is not always the case. The
following examples illustrate that signature quality can sometimes be improved by using
a C2 configuration filter (see Section 7.2). Frequency parameters were determined by the
GFFS algorithm, and results are compared to filters with frequencies determined by the
DFT method. Both natural and synthetic textures are examined. All natural textures
were digitized (512 x 512 pixels, 256 graylevels) from Brodatz [13], and all texture pairs
were adjusted for equal average intensity. For simplicity, only symmetrical Gabor filters
were used; i.e., σx = σy. These examples match up textures from various classes
(per Rao’s classification [12]). This gives a broad, strong test for the validity of the
analyses in Chapter 5 and the efficacy of the GFFS algorithm developed in Chapter 6.

Let us begin by examining the triangles-and-arrows image in Fig. 7.1 in more
detail. Fig. 7.1c is the output of a Gabor filter (C1 configuration) with the frequency
parameters determined by GFFS and σ equal to the texel spacing. For this example,
GFFS predicts similar frequency parameters for σs ranging from 12 to 36 pixels. So, in
this case, the results are insensitive to σ. Figs. 7.1d and 7.1e show the two stages of a C2
configuration filter applied to Fig. 7.1a. Fig. 7.1b shows the result of applying a Canny
edge detector to Fig. 7.1e. The DFT method was also applied to Fig. 7.1a. DFT predicts
a similar radial frequency, but the orientation angle φ has the opposite sign (-45° as
opposed to 45°). The GFFS algorithm ranks the DFT prediction 20th (σ = 24), with
a corresponding Pe value of 0.060. The output signature for the DFT-predicted value
(not shown) was examined and found to be similar to Fig. 7.1c.

Fig. 8.10a shows a pair of natural textures. The left region is Brodatz’s “grass
lawn” (D9) and the right region is “cotton canvas” (D77). D9 is an example of a disordered
texture, while D77 is strongly ordered [12]. GFFS again predicts the same center
frequency for σ ranging from 12 through 36; however, in this case, it agrees with the
DFT-predicted values. Fig. 8.10c shows the result of applying a C1 configuration filter
to Fig. 8.10a. In this case Pe < 0.00001, which is consistent with the improved signature
quality compared to Fig. 7.1c. Figs. 8.10d and 8.10e show the two stages of a C2
configuration filter applied to Fig. 8.10a. For this example, high-quality signatures are
obtained for both C1 and C2 configurations. Fig. 8.10b shows the result of applying a
Canny edge detector to Fig. 8.10e.

Fig. 8.11a consists of “straw matting” (D55) and “raffia” (D84). Both of these
textures are strongly ordered [12]. Notice that although the texels in the two regions
are perceptually different, they are similar in size, orientation, and aspect ratio. In
this case, for both σ = 36 and σ = 24, GFFS predicts two very different frequencies
(F = 0.491, φ = 59.4° and F = 0.046, φ = 87.4°) with similar error probabilities. The
center frequency corresponding to the smaller value of F agrees with the DFT-predicted
value. At σ = 12, however, GFFS predicts Fs that are all much greater than the DFT
value. In general, it can be expected that for large σs, DFT and GFFS will predict
similar values. This is because as the effective window size approaches the size of the
region, the windowed Fourier transform approaches the discrete Fourier transform of the
entire region. Fig. 8.11c shows the result of applying a C1 configuration filter to Fig.
8.11a. Figs. 8.11d and 8.11e show the two stages in applying a C2 filter to the same
texture pair. Here we can begin to see a difference in signature quality between the C1
and C2 configurations. Fig. 8.11b shows the result of applying a Canny edge detector
to Fig. 8.11c. (In this case, edge localization is slightly better for the C1 output.)

Fig. 8.12a consists of “pressed cork” (D4) and “beach sand” (D29). Rao classifies
D4 as disordered and D29 as strongly ordered [12]. In spite of this difference in type, the
large variation in grain size of the sand makes these textures perceptually very similar.
In this case, the values predicted by DFT and GFFS differ greatly even for σ = 36.
Fig. 8.12b shows the result of applying a C1 filter to Fig. 8.12a with the DFT-predicted
frequency values. Note that although several peaks occur, there seems to be no indication
of the texture-boundary location. A C2 filter was also applied to this texture pair with
the same frequency value as in Fig. 8.12b, but with σ = 12. The result (not shown)
shows little, if any, improvement. In this case, the DFT method fails to predict a useful
center frequency. Also tested were the next two best frequency values predicted by the
DFT method, with similarly poor results (not shown). Fig. 8.12c shows the result of
applying a C1 filter to Fig. 8.12a using the center frequency predicted by GFFS. Here,
using a large value of σ produces too much discrimination within the D29 region. By
reducing σ and applying a C2 configuration filter to this texture pair, the sequence shown
in Figs. 8.12d and 8.12e was obtained. Note the high quality of the step after the second
stage.

As the previous examples indicate, the DFT method can predict suitable filter
parameters. It is also faster than GFFS. It is, however, deficient in the following ways:

• The DFT method does not always predict the same center frequency as GFFS.
When this occurs, GFFS produces a better signature.

• GFFS, with its quantitative measure of signature quality, provides a way of
estimating the relative discriminability among different texture pairs. The DFT
provides no such indication.

• In some cases, the best choice of center frequency depends on σx and σy. With
GFFS, the best center frequency can be determined for any value of σx and σy.
The DFT method, on the other hand, can produce only one value based on the
size of the entire region.

• Section 7.2.5 described how zero padding can increase the number of center
frequencies tested. In this way, it is possible to check that signature quality degrades
gracefully as frequency varies. In practice, GFFS predicts similar signature quality
over a wide range of adjacent frequency components. For the DFT method, the
best choices often occur at very different frequency values. Thus it is uncertain
how sensitive the “best” frequency is to slight frequency perturbations.

• The DFT method cannot predict center frequencies for the valley/ridge or
difference-in-variance signatures.
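To make the comparison concrete, here is one plausible, simplified reading of a DFT-based frequency picker: take the DFT of each region as a whole and choose the frequency at which the two magnitude spectra differ most. This is an assumption for illustration only; the DFT method actually used is defined in Section 7.2, and the function name below is hypothetical.

```python
import numpy as np

def dft_center_frequency(region_a, region_b):
    """Pick (U, V) in cycles/pixel where the two regions' DFT magnitudes differ most.

    Assumption: both regions are equal-sized 2-D arrays; the mean is removed so
    that the DC component does not dominate.
    """
    mag_a = np.abs(np.fft.fft2(region_a - region_a.mean()))
    mag_b = np.abs(np.fft.fft2(region_b - region_b.mean()))
    diff = np.abs(mag_a - mag_b)
    row, col = np.unravel_index(np.argmax(diff), diff.shape)
    v = np.fft.fftfreq(diff.shape[0])[row]   # vertical frequency
    u = np.fft.fftfreq(diff.shape[1])[col]   # horizontal frequency
    return u, v

# Synthetic check: region A has a strong horizontal grating, region B a weaker vertical one.
y, x = np.mgrid[0:64, 0:64]
region_a = 2.0 * np.cos(2 * np.pi * 4 * x / 64)
region_b = 1.0 * np.cos(2 * np.pi * 8 * y / 64)
U, V = dft_center_frequency(region_a, region_b)
```

Such a picker returns a single frequency for the whole region and carries no measure of signature quality, which is in line with the limitations listed above.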

8.2.7  Miscellaneous  Texture  Examples 

The examples above are meant to typify Gabor-filter outputs, but there are
exceptional cases. For example, if a filter is tuned to a frequency component that has
similar magnitude in both textured regions, the output can be non-discriminating; i.e.,
the filter is not appropriate for discriminating between these two regions. If the regions
are uniform, then the filter output will be flat. If they are nonuniform, then the filter
outputs may exhibit many fluctuations and show no distinguishing characteristics between
regions. In the extreme case, when the frequency composition of the regions is similar,
the image cannot be segmented. An example of such an image is the nonuniform texture
pair consisting of Rs and mirror-image Rs presented in Fig. 2.3. Although many C1 and
C2 configuration filters were applied to this image (results not shown), no distinct output
signatures were found. It appears that the filter configurations developed here are
ineffective in segmenting this texture. While this might be considered a limitation, at least
it agrees with perception. It should be noted that because of the inherent variability
within textures, no one algorithm can successfully segment all textures. For without a
mathematical definition of texture, who is to say when a difference in input corresponds
to a difference in texture or simply an acceptable variation within the texture? Thus,
using human perception as a benchmark does not seem unreasonable.

Another exceptional case is when a filter is tuned to a frequency band not
involved in determining a difference in texture. In this case, a discontinuity might occur
at a location other than the texture boundary. This “problem” also exists for the human
visual system in the form of optical illusions and the perception of structures within
structures. Fig. 8.13 is an example demonstrating this phenomenon. Fig. 8.13a is an
example of structure within texture (a similar example was mentioned in Chapter 2),
first presented by Beck [9]. The left region consists of alternating rows of right-facing Us
and left-facing Us, while the right region consists of alternating columns. The interesting
feature here is that the texture boundary is not readily perceived. Rather, one tends
to see three vertical black bars on the right. Beck refers to these bars as “emergent”
features. Fig. 8.13b shows the Gabor-filter output for a C1 configuration filter
(parameters determined by experiment). Note that the black bars are readily distinguished
by the three vertical ridges in Fig. 8.13b, whereas the texture boundary produces no
distinct output feature. This suggests that these simple filters are capable of detecting
features previously thought to involve more complex processing (e.g., edge detection,
feature linking, inhibitory interactions) [9, 56, 62].


Chapter  9 


Conclusion 


This  thesis  studies  the  design  of  filters  for  texture  segmentation.  It  provides 
mathematical  and  experimental  evidence  suggesting  that  the  application  of  Gabor  filters 
to a textured image produces certain characteristic output signatures that are useful
for  segmenting  the  image.  Although  the  quantitative  analysis  is  limited  to  a  simplified 
texture  subset,  qualitative  arguments  and  experimental  results  are  provided  indicating 
that  the  results  apply  in  general. 

Signature characteristics can best be described by dividing textures into two
classes: uniform and nonuniform. For the class of uniform textures, output signatures
occur in one of three forms: a step, valley, or ridge. Analysis shows that the
step signature occurs when two textured regions differ in texel-frequency composition.
On the other hand, the valley and ridge signatures occur when two regions exhibit a
texture-phase discontinuity. The regular nature of uniform textures produces smooth,
well-behaved signatures that are easy to segment using edge-detection methods. For the
step signature, region-based techniques would also be effective. For nonuniform textures,
the step, valley, and ridge signatures still occur; however, the presence of texel variation
induces local fluctuations in the signatures. This makes subsequent segmentation less
precise. For nonuniform textures, a fourth signature type can occur. This signature
takes the form of a step change in average local output variation.

Although the characteristic signatures mentioned above are useful for texture
segmentation, quality signatures occur only when the five filter parameters (σx, σy, U, V, θ)
are “tuned” to the texture being processed. Analysis shows that the choice of (σx, σy)
depends on the texel spacing within the texture. This choice is a tradeoff between
signature smoothness and accurate texture-boundary localization. It is further shown that
when texel spacing differs in x and y, asymmetric filters (i.e., filters consisting of a
non-circularly symmetric Gaussian) can be beneficial. For asymmetric filters, θ should be
chosen to match the texel-spacing lattice. The choice of filter center frequency (U, V),
on the other hand, is determined both by the texel spacing and by the difference in
frequency content between texels in different regions. The texel spacing determines
certain frequency harmonics in the texel Fourier transform. It is shown that, in order to
avoid signature anomalies called overshoot and undershoot, (U, V) should equal one of
these harmonics. It is further shown that the harmonic that differs most between texels
in different regions produces the “best” signature (i.e., greatest amplitude difference).
These frequency guidelines, however, were developed for uniform textures and are only
approximately correct for nonuniform and natural textures. So, to provide effective
frequency parameters for textures in general, an algorithm was developed that finds the
“best” center frequencies for any given texture pair.

The filters used in this thesis are based on Gabor elementary functions. They were
chosen because they have certain desirable properties for texture analysis. They are also
suggested frequently in the literature. Results show, however, that it is the bandpass
characteristic of these functions that is responsible for producing signatures; therefore,
any class of functions that exhibit bandpass characteristics and are well localized in both
the space and spatial-frequency domains would suffice.

This  thesis  provides  a  detailed  analysis  of  individual  filters.  There  are,  however, 
two  questions  that  remain  unresolved:  Should  an  alternate  nonlinearity  be  used  after 
applying  a  bandpass  filter?  How  should  filters  be  configured  to  form  a  filter  bank?  The 
question  of  choosing  a  nonlinearity  is  discussed  in  Appendix  A.  The  issue  of  filter  bank 
configuration is briefly discussed below.

It  is  clear  that  in  an  autonomous  texture-segmentation  architecture,  filters  cannot 
be customized to individual textures. In principle, a bank of filters is required that spans
the  expected  orientation  and  frequency  domain  of  the  textures  of  interest.  Although 
certain  constraints  on  filter-bank  design  were  presented  in  Chapter  6,  two  major  issues 
remain: 

1. What characteristics of the filter output should be used for segmentation:
discontinuities at texture boundaries or texture-region information?

2. How should the response from multiple filters be integrated into a meaningful
output: should one, possibly dominant, output be selected as representative
(filter-output selection), or should many filter outputs be combined into a kind of feature
vector (filter-output combination)?

This  thesis  shows  that  discontinuities  in  a  single  filter-output  can  be  effective  for  texture 
segmentation;  thus  discontinuity  detection  and  filter  selection  seem  appropriate.  In  fact, 
for  textures  exhibiting  only  a  texture-phase  discontinuity  as  in  Fig.  1.3,  discontinuity 
detection  is  required,  since  the  textured  regions  are  identical.  For  texture  classification, 
however, discontinuities alone cannot provide sufficient information. It seems that, for
this task, region data from multiple filters is required. In some cases, it seems that
multiple  filter  outputs  are  also  required  for  texture  segmentation,  especially  if  simulating 
human  performance  is  desired.  Consider  the  uniform  texture  pair  in  Fig.  3.3c.  The 
human  visual  system  has  difficulty  in  segmenting  this  image;  yet,  many  Gabor  filters 
produce  distinct  filter-output  discontinuities  at  the  texture  boundary.  This  is  true,  in 
fact,  for  any  uniform  texture  pair  whose  regions  have  different  constituent  texels.  This 
occurs  because  regions  with  different  texels  (and  thus  different  texel  Fourier  transforms) 
produce  different  filter  outputs.  And,  due  to  the  lack  of  “noise”  in  uniform  textures, 
even  small  filter-output  differences  are  detectable.  So  why  is  Fig.  3.3c  so  difficult  to 
segment  for  humans?  Examining  the  Fourier  transform  magnitudes  of  the  texels  in 
Fig.  3.3c  reveals  that  although  occasional  differences  exist  between  the  two  transforms, 
on  the  average  they  are  quite  similar  (much  more  so  than  for  the  texels  of  Fig.  3.3a). 
This  suggests  that  the  human  visual  system  might  be  pooling  information  from  multiple 
filters  and  basing  segmentation  on  some  form  of  average  (possibly  with  thresholding). 
This  procedure  would  also  provide  noise  immunity.  Clearly,  both  of  these  issues  require 
additional  study. 


Appendix  A 


Choosing  the  Nonlinearity 


This  appendix  further  discusses  the  choice  of  nonlinearity  for  the  Gabor  filter 
(3.1).  As  shown  below,  computing  the  magnitude  after  filtering  results  in  a  loss  of 
information. In particular, the phase component of the filter output is discarded. In
fact,  some  experiments  have  shown  that  the  image-phase  component  is  more  important  in 
preserving  image  quality  than  is  the  amplitude  component  [81].  To  avoid  this  information 
loss,  some  researchers  have  proposed  methods  for  extracting  phase  information  directly 
[53,  82]. 

The primary motivation for ignoring phase comes from psychophysical studies
of the human visual system. Although neurophysiological evidence suggests that
quadrature-pair filters might exist in the visual cortex [83] (thus enabling phase
detection), certain psychophysical results suggest that humans do not encode phase
information directly [84], at least not for texture segmentation. Consequently, some researchers
have  explicitly  eliminated  phase  information  from  their  texture-segmentation  algorithms 
[38,  56].  Although,  admittedly,  information  is  lost  by  ignoring  phase,  some  phase-related 
phenomena  can  be  recovered  directly  from  the  amplitude  envelope  (see  Bovik  [66]  for  a 
discussion  of  the  effect  of  phase  on  the  amplitude  envelope).  This  is  because  the  phase 
and amplitude components are not independent: a change in one will produce a change
in the other.

To further explore these points, suppose that instead of computing the magnitude
of the GEF-filtered image, one simply demodulates this image and applies a lowpass
filter. Demodulating the GEF-filtered image (5.17) essentially eliminates the complex
exponential, leaving a pair of offset gates with complex coefficients T1 and T2. Contrast
this with (5.18), where the magnitude operation has been applied, and only the
magnitudes of T1 and T2 determine the gate amplitudes. If T1 and T2 differ only in sign, the
Gabor-filter output m in (5.18) is nondiscriminating. This implies that if textures differ
only in the sign of contrast, the textures cannot be discriminated. This is precisely the
argument used by Malik and Perona in criticizing the magnitude computation [56]. By
using the demodulation approach, though, the coefficients T1 and T2 after demodulation
are complex. Hence the filter output will reflect not only differences in sign, but also
differences in phase. Although this approach is more discriminating, the sensitivity to
phase will lead to segmentations that are not consistent with human performance.

There is a method for retaining sign differences between $T_1$ and $T_2$ while ignoring
phase information. The method involves convolving the image with only the real portion
of a GEF. More precisely, define a new filter $h_r$ as a Gaussian-modulated sinusoid (cf.
(3.2))

\[ h_r(x,y) = g(x',y')\cos[(Ux + Vy) + \phi] \tag{A.1} \]

where $\phi$ is some arbitrary constant phase angle. Let $H_r$ be the Fourier transform of
$h_r$:

\[ H_r(u,v) = \tfrac{1}{2}\left[H_+ e^{-j\phi} + H_- e^{j\phi}\right] \tag{A.2} \]

where

\[ H_+ = \exp\left\{-\tfrac{1}{2}\left[(\sigma_x[u+U]')^2 + (\sigma_y[v+V]')^2\right]\right\} \]

\[ H_- = \exp\left\{-\tfrac{1}{2}\left[(\sigma_x[u-U]')^2 + (\sigma_y[v-V]')^2\right]\right\} \]

Note that $H_r$ is similar to (3.4) except that it is symmetric about the frequency origin.
Applying $H_r$ to $I$ in (5.9) gives (cf. (5.14-5.15))


(A.3) 


where 


\[ I_{r+} = H_+ e^{-j\phi} S(u+U, v+V)\left\{T_1(U,V) + T_2(U,V)e^{-j\tau(u+U)}\right\} \]

\[ I_{r-} = H_- e^{j\phi} S(u-U, v-V)\left\{T_1(-U,-V) + T_2(-U,-V)e^{-j\tau(u-U)}\right\} \]


Note that $I_r$ is equivalent to the sum of $I_f$ in (5.14) and the mirror image of $I_f$.

We now demodulate and lowpass filter $I_r$. This is equivalent to shifting both $I_f$
and the mirror image of $I_f$ to the origin and summing them. This results in

\[ I_d(u,v) \approx K\left\{T_1(U,V) + T_2(U,V)e^{-j\tau u}\right\} + K\left\{T_1(-U,-V) + T_2(-U,-V)e^{-j\tau u}\right\} \tag{A.4} \]

where

For real images $T_1(U,V)$ and $T_1(-U,-V)$ are complex conjugates, as are $T_2(U,V)$ and
$T_2(-U,-V)$. Therefore, (A.4) reduces to

\[ I_d(u,v) \approx 2K\left\{\mathrm{Re}[T_1] + \mathrm{Re}[T_2]e^{-j\tau u}\right\} \tag{A.5} \]

By arguments similar to those used for deriving the step signature in Section 5.2.1,
the final output of the alternate filter $i_d(x,y)$, which is given by the inverse Fourier
transform of (A.5), approximates two offset gate functions coincident with the region
boundaries. The amplitudes of the gates are proportional to the real parts of $T_1$ and
$T_2$ rather than their magnitudes. Thus the sign is preserved without retaining phase
information. This approach can then discriminate textures whose texels differ only in
the sign of contrast.
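The contrast between the magnitude computation and the demodulated real-filter output can be illustrated numerically. The following is a minimal 1-D sketch, not the implementation used in this work: the test signal (two regions of `cos(Ux)` with opposite sign), the parameter values, the truncation of the Gaussian at three standard deviations, and the function names `gaussian`, `convolve`, and `compare_outputs` are all assumptions made for illustration. Demodulation is modeled as multiplication by `2cos(Ux)` followed by a Gaussian lowpass filter, with $\phi = 0$ in (A.1).

```python
import cmath
import math

def gaussian(sigma):
    # Unit-sum Gaussian window truncated at +/- 3 sigma (an assumed truncation).
    half = int(3 * sigma)
    w = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-half, half + 1)]
    total = sum(w)
    return [v / total for v in w]

def convolve(signal, kernel):
    # Same-size linear convolution; samples outside the signal count as zero.
    half = len(kernel) // 2
    n = len(signal)
    return [sum(kernel[k + half] * signal[x - k]
                for k in range(-half, half + 1) if 0 <= x - k < n)
            for x in range(n)]

def compare_outputs(N=128, U=2 * math.pi / 8, sigma=4.0):
    # Two regions whose texture differs only in the sign of contrast.
    texture = [(1 if x < N // 2 else -1) * math.cos(U * x) for x in range(N)]
    g = gaussian(sigma)
    half = len(g) // 2
    gef = [g[k + half] * cmath.exp(1j * U * k) for k in range(-half, half + 1)]
    h_r = [v.real for v in gef]                       # real part of the GEF, phi = 0
    magnitude = [abs(v) for v in convolve(texture, gef)]
    real_out = convolve(texture, h_r)
    # Demodulate with 2*cos(Ux), then lowpass with the same Gaussian.
    demod = convolve([2 * r * math.cos(U * x) for x, r in enumerate(real_out)], g)
    return magnitude, demod

magnitude, demod = compare_outputs()
print(magnitude[32], magnitude[96])   # nearly equal on both sides of the boundary
print(demod[32], demod[96])           # opposite signs on the two sides
```

Away from the boundary, the magnitude output takes nearly the same value in both regions, while the demodulated output changes sign, which is the behavior predicted by (A.5) for this simple case.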

It is important to point out that all of the operations used here are linear, and
as mentioned in Chapter 3, some form of nonlinearity is essential in simulating human
performance (assuming this is desirable). If a suitable nonlinearity could be found and
imposed after demodulation, this method could provide an alternative to the more elaborate
architecture proposed by Malik and Perona [56].

Farrokhnia and Jain used $h_r$ in their texture-segmentation work, but still employed
the magnitude computation [69, 70]; however, it was just shown that the sign
information is still lost with this approach. The contribution here is analytically showing
the potential of demodulating the filter output and contrasting this approach with
other methods. It is important to realize that simply using $h_r$ as a filter does not
guarantee that sign information will be preserved.


Appendix  B 


Overshoot  and  Undershoot 


Section 5.2.1 considered what happens when a Gabor filter is applied to a uniformly
textured image that contains two textures whose respective texels differ. As was
shown, if the Gabor filter is not tuned to an harmonic, the step signature can exhibit
overshoot and/or undershoot near the texture boundary. Also, if the Gabor filter is
tuned to an harmonic, overshoot cannot occur in the step signature. This appendix
discusses the issues of overshoot and undershoot.

First, consider the case when the Gabor filter is not tuned to an harmonic. Consider
the expression for $i_f$ in (5.17). This represents the output of a GEF-filtered image.
If the GEF is not tuned to an harmonic, then by the ridge-signature arguments of
Section 5.2.2, $s_f(x,y)$ and $s_f(x-\tau,y)$ are complex and defined as in (5.31) and (5.32).
Computing the complex magnitude of (5.17) for this case gives

\[ m(x,y) = |i_f(x,y)| \propto \sqrt{P_0P_0^* + P_0P_\tau^* + P_0^*P_\tau + P_\tau P_\tau^*} \tag{B.1} \]

where $P_0 = T_1(U,V)s_f(x,y)$ and $P_\tau = T_2(U,V)s_f(x-\tau,y)$. By an analysis similar to
that for the ridge, it can be shown that

\[ m(x,y) \to A_1 \propto |T_1|\gamma_1 \]

far to the left of the texture boundary, and

\[ m(x,y) \to A_2 \propto |T_2|\gamma_1 \]

far to the right of the texture boundary, where $\gamma_1$ is defined in (5.37). Per (5.34), $h_r$
in (5.37) is a cosine-modulated Gaussian. Then, for large $\tau$, $\gamma_1 < 1$. Thus $A_1$ and $A_2$
are less than when the filter is tuned to an harmonic (cf. $A_1$ and $A_2$ at the end of
Section 5.2.1). Because the imaginary component of $s_f$ does not go to zero near the
texture boundary, however, constructive and destructive interference can occur between
terms in (B.1) to produce values of $m < \min(A_1, A_2)$ (undershoot) or $m > \max(A_1, A_2)$
(overshoot).

An analytical demonstration of overshoot or undershoot using (B.1) is difficult.
The problem is that the location of the overshoot/undershoot is not in general at the
texture boundary, and analyzing (B.1) away from the boundary is complicated by the
interactions of several complex variables. As an alternative, Chapter 8 presents examples
of Gabor-filtered texture pairs that exhibit overshoot and undershoot and compares the
corresponding outputs to the signatures predicted by (B.1).

We now show for the scenario of Section 5.2.1 that if a Gabor filter is tuned to
an harmonic, overshoot cannot occur in a step signature. The absence of overshoot
can be demonstrated by showing that $|i_f|$ in (5.18) cannot achieve a value greater than
the asymptotic value associated with $\max(|T_1|, |T_2|)$; i.e., $|i_f|$ cannot exceed its
maximum asymptotic value.

Squaring (5.18), eliminating the constants $\Delta x$ and $\Delta y$, and realizing that
$s_f(x-\tau,y) \approx 1 - s_f(x,y)$, the problem reduces to showing that
$k \le \max(|T_1|^2, |T_2|^2)$, where

\[ k = T_1T_1^* s_f^2 + T_2T_2^*(1 - s_f)^2 + (T_1^*T_2 + T_1T_2^*)s_f(1 - s_f). \]

Since $k$ is quadratic in $s_f$, it has one maximum or minimum. The quantity

\[ \frac{1}{2}\frac{d^2k}{ds_f^2} = T_1T_1^* + T_2T_2^* - T_1T_2^* - T_1^*T_2 = (T_1 - T_2)(T_1 - T_2)^* \]

is always non-negative. Thus, $k$ has a single minimum, and its maximum values must
occur at the end points; i.e., at $s_f = 0$ or $s_f = 1$. If $s_f = 1$, then $k = |T_1|^2$. If $s_f = 0$,
then $k = |T_2|^2$, which implies that $k \le \max(|T_1|^2, |T_2|^2)$.
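The endpoint argument can also be checked numerically. The sketch below is illustrative only: it assumes that $s_f$ is real and confined to $[0, 1]$ (as in the tuned case), draws random complex values for $T_1$ and $T_2$, and samples $k$ on a grid. The function names `k_value` and `check_no_overshoot` are hypothetical.

```python
import random

def k_value(T1, T2, s):
    # k = T1 T1* s^2 + T2 T2* (1 - s)^2 + (T1* T2 + T1 T2*) s (1 - s), s real.
    cross = (T1.conjugate() * T2 + T1 * T2.conjugate()).real
    return abs(T1) ** 2 * s ** 2 + abs(T2) ** 2 * (1 - s) ** 2 + cross * s * (1 - s)

def check_no_overshoot(trials=200, samples=201, seed=0):
    # Returns True if k never exceeds max(|T1|^2, |T2|^2) on [0, 1].
    rng = random.Random(seed)
    for _ in range(trials):
        T1 = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
        T2 = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
        bound = max(abs(T1) ** 2, abs(T2) ** 2)
        for i in range(samples):
            if k_value(T1, T2, i / (samples - 1)) > bound + 1e-12:
                return False
    return True

print(check_no_overshoot())
```

A sampled check of this kind does not replace the proof; it only confirms that the bound holds for the cases tried.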


Appendix  C 


Estimating  Ridge  Height 


As  Section  5.2.2  showed,  if  an  improperly  tuned  Gabor  filter  is  applied  to  an 
image  exhibiting  a  texture-phase  difference  between  two  regions,  a  ridge  signature  can 
arise  at  the  region  (texture)  boundary.  To  segment  such  an  image,  the  location  of  the 
ridge  must  be  detected.  The  ease  of  detecting  a  ridge  signature  is  related  to  its  height. 
This  appendix  develops  a  method  for  computing  this  height. 

Ridge height at a texture-phase discontinuity is given by $\gamma_1 C z(\alpha, \psi)$ in (5.40),
where $\gamma_1 C$ represents the Gabor-filtered output far removed from the discontinuity and
$z > 1$ represents the relative ridge height. To determine $z$, $\alpha$ is computed by resolving
(5.37), (5.38), and (5.41). Assume that the spatial extent of the GEF $h$ is effectively
contained within the image boundaries. Thus, (5.37) and (5.38) become


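\[ C_1 = \int_{-\infty}^{\infty}\int_{0}^{\infty} h_r(\alpha,\beta)\,d\alpha\,d\beta
       = 2\int_{-\infty}^{\infty}\int_{0}^{\infty} g(x',y')\cos[\delta U x + \delta V y]\,dx\,dy \]

\[ C_2 = \int_{-\infty}^{\infty}\int_{0}^{\infty} h_i(\alpha,\beta)\,d\alpha\,d\beta
       = 2\int_{-\infty}^{\infty}\int_{0}^{\infty} g(x',y')\sin[\delta U x + \delta V y]\,dx\,dy \]

After expanding the trigonometric functions and separating the integrals, we find that

\[ C_1 = 4c_{\delta U}c_{\delta V}, \qquad C_2 = 4c_{\delta V}s_{\delta U} \tag{C.1} \]

where

\[ c_{\delta U} = \frac{1}{\sqrt{2\pi}\,\sigma_x}\int_{0}^{\infty} e^{-\frac{1}{2}(x/\sigma_x)^2}\cos(\delta U x)\,dx \]

\[ c_{\delta V} = \frac{1}{\sqrt{2\pi}\,\sigma_y}\int_{0}^{\infty} e^{-\frac{1}{2}(y/\sigma_y)^2}\cos(\delta V y)\,dy \]

\[ s_{\delta U} = \frac{1}{\sqrt{2\pi}\,\sigma_x}\int_{0}^{\infty} e^{-\frac{1}{2}(x/\sigma_x)^2}\sin(\delta U x)\,dx \tag{C.2} \]

Thus, $\alpha$ becomes

\[ \alpha = C_2/C_1 = s_{\delta U}/c_{\delta U}. \tag{C.3} \]

Evaluation of (C.1-C.3) gives

\[ \alpha = \sqrt{\frac{2}{\pi}}\,\delta U \cdot \sigma_x\;{}_1F_1\!\left(\tfrac{1}{2};\tfrac{3}{2};\tfrac{1}{2}(\delta U \cdot \sigma_x)^2\right) \tag{C.4} \]

where ${}_1F_1$ represents Kummer's confluent hypergeometric series. Thus, $\alpha$ depends
only on $\delta U \cdot \sigma_x$. Table C.1 shows values of $\alpha$ as a function of $\delta U \cdot \sigma_x$.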


Given a Gabor filter with parameters $(\sigma_x, \sigma_y, U, V, \theta)$, we can determine $\delta U$
by finding the harmonic indices $(k,l)$ such that $|U - 2\pi k/\Delta x|$ and $|V - 2\pi l/\Delta y|$ are
minimized. Then, $\delta U = U - 2\pi k/\Delta x$, and $\alpha$ can be determined from Table C.1 or
(C.4). Given $\alpha$ and using $(k,l)$ to evaluate (5.26), the relative ridge height can be
determined from (5.42).

Table C.1. Numerical evaluation of the integral ratio $\alpha = s_{\delta U}/c_{\delta U}$ as a function
of the product of parameters $\delta U$ and $\sigma_x$.

    $\delta U \cdot \sigma_x$      $\alpha$
    0.05         0.255
    0.0937       0.499
    0.10         0.536
    0.20         1.343
    0.288        2.834
    0.30         3.166
    0.50         40.96
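The ratio $\alpha$ and the offset $\delta U$ can be evaluated with a few lines of code. The sketch below is an illustration under stated assumptions, not the procedure used to generate Table C.1. It relies on the standard identity $\mathrm{erfi}(w) = (2w/\sqrt{\pi})\,{}_1F_1(\tfrac{1}{2};\tfrac{3}{2};w^2)$, so that the ratio of the half-line sine and cosine Gaussian integrals is $\mathrm{erfi}(\delta U\,\sigma_x/\sqrt{2})$ when $\delta U$ is in radians per unit length. The tabulated values appear to be reproduced when the tabulated product is first multiplied by $2\pi$ (that is, read as cycles rather than radians); this is an observation from the numbers, not a statement made in the text. The function names are hypothetical.

```python
import math

def erfi(w, terms=80):
    # Maclaurin series: erfi(w) = (2/sqrt(pi)) * sum_n w^(2n+1) / (n! (2n+1)).
    total, term = 0.0, w              # term holds w^(2n+1) / n!
    for n in range(terms):
        total += term / (2 * n + 1)
        term *= w * w / (n + 1)
    return 2.0 / math.sqrt(math.pi) * total

def alpha(delta_u_sigma):
    # Ratio of the sine to the cosine half-line Gaussian integral,
    # with delta_u_sigma = dU * sigma_x expressed in radians.
    return erfi(delta_u_sigma / math.sqrt(2.0))

def nearest_harmonic_offset(U, spacing):
    # Harmonic index k minimizing |U - 2*pi*k/spacing|, and the offset dU.
    k = round(U * spacing / (2.0 * math.pi))
    return k, U - 2.0 * math.pi * k / spacing

# Tabulated products, interpreted here (as an assumption) in cycles.
for product in (0.05, 0.0937, 0.10, 0.20, 0.288, 0.30, 0.50):
    print(product, round(alpha(2.0 * math.pi * product), 3))
```

For example, `nearest_harmonic_offset(1.0, 8.0)` selects `k = 1` and returns the offset of `U = 1.0` from the first harmonic of a texture with spacing 8.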

Appendix  D 


Implementation  Details 

This  appendix  discusses  the  implementation  details  of  filter  application  and  edge 
detection.  Applying  a  Gabor  filter  to  an  image  involves  convolving  the  image  with  a  GEF 
and  then  computing  the  magnitude  of  the  convolution  result.  Convolution  is  performed 
in  the  frequency  domain  by  using  the  Discrete  Fourier  Transform  (DFT).  The  steps  in 
this  procedure  are  summarized  below: 

1. Input the filter parameters $\lambda$, $f$, $\phi$, and $\sigma$, where $\lambda$ defines the aspect ratio of the
filter, $f$ and $\phi$ determine the filter center frequency, and $\sigma$ corresponds to $\sigma_x$ in
(3.3).

2. Input the textured image $I$ as a square array ($n \times n$).

3. Define the dimensions of the filter in $x$ and $y$ as $x = 6\sigma + 1$ and $y = 6\lambda\sigma + 1$,
respectively.

4. Append zeros to the input array $I$ to form a square array of dimension $d =
n + \max(x,y)$.

5. Compute the DFT of the expanded input array $I_d$.

6. Construct an $x \times y$ filter array $F$ by sampling the equation for a GEF determined
by the filter parameters input at step (1).

7. Append zeros to the filter array $F$ to form a square array $F_d$ equal in size to the
expanded input array $I_d$.

8. Compute the DFT of the expanded filter array $F_d$.

9. Multiply the DFT of $I_d$ by the DFT of $F_d$, and compute the inverse DFT of the
result. Call the resulting output array $R_d$. At this point the GEF has been applied
to the input image. If a lowpass filter is to be used to smooth the output (as for a
C2 configuration filter), the following additional steps are performed:

• Discard all points within $\max(x,y)$ from the boundaries of $R_d$ to eliminate
boundary anomalies. Call the new array $R_m$, where $m = d - 2\max(x,y)$ is
its dimension.

• Input the size parameter $\sigma_l$ for the lowpass filter.

• Define both the $x$ and $y$ dimensions of the lowpass filter equal to $s = 6\sigma_l + 1$.

• Construct an $s \times s$ lowpass filter array $G$ by sampling the equation for a
circularly symmetric Gaussian with parameter $\sigma_l$.

• Append zeros to $R_m$ to form a square array $R_p$ of dimension $p = m + s$.

• Append zeros to $G$ to form a square array $G_p$ equal in size to $R_p$.

• Compute the DFT of $R_p$ and the DFT of $G_p$.

• Multiply the DFT of $R_p$ by the DFT of $G_p$, and compute the inverse DFT
of the result. Call the result $S_p$. At this point the lowpass filter has been
applied to the GEF output.

10. Compute the magnitude of the output array (either $R_d$ or $S_p$).

11. Discard appropriate boundary points.

12. Scale the result to the range [0, 255].

Note that the DFT algorithm used in this implementation omits the normal $1/N^2$ scaling
factor associated with the DFT. This results in abnormally large output values when
DFTs are multiplied and is the reason why the reported filter output values are so large.
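The main path of the procedure above (steps 1 through 12 without the optional lowpass branch) can be sketched as follows. This is a minimal illustration rather than the original implementation, and it makes several assumptions that are not taken from the text: the Gaussian is unrotated, the center frequency is passed directly as `(U, V)` in radians per pixel instead of through $f$ and $\phi$, a naive DFT stands in for a fast transform, the inverse DFT applies the usual scaling (so the output values are not inflated as described in the note above), and the boundary points discarded in step 11 are chosen so that the output aligns with the input image. All function names are hypothetical.

```python
import cmath
import math

def dft1(seq, inverse=False):
    # Naive O(N^2) 1-D DFT; only the inverse applies the 1/N factor.
    n = len(seq)
    sign = 1 if inverse else -1
    out = [sum(v * cmath.exp(sign * 2j * math.pi * k * t / n) for t, v in enumerate(seq))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def dft2(a, inverse=False):
    # Separable 2-D DFT: transform every row, then every column.
    rows = [dft1(r, inverse) for r in a]
    cols = [dft1(c, inverse) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def zero_pad(a, d):
    # Append zeros so the array becomes d x d (steps 4 and 7).
    padded = [list(r) + [0.0] * (d - len(r)) for r in a]
    return padded + [[0.0] * d for _ in range(d - len(a))]

def sample_gef(sigma, lam, U, V):
    # Steps 3 and 6: sample g(x, y) * exp(j(Ux + Vy)) on a grid of about
    # (6*sigma + 1) by (6*lam*sigma + 1) points. Assumes an unrotated Gaussian.
    sx, sy = sigma, lam * sigma
    hx, hy = int(3 * sx), int(3 * sy)
    norm = 1.0 / (2.0 * math.pi * sx * sy)
    return [[norm * math.exp(-0.5 * ((x / sx) ** 2 + (y / sy) ** 2))
             * cmath.exp(1j * (U * x + V * y))
             for x in range(-hx, hx + 1)]
            for y in range(-hy, hy + 1)]

def gabor_magnitude(image, sigma, lam, U, V):
    n = len(image)
    F = sample_gef(sigma, lam, U, V)
    fy, fx = len(F), len(F[0])
    d = n + max(fx, fy)                                      # step 4
    Id = dft2(zero_pad(image, d))                            # step 5
    Fd = dft2(zero_pad(F, d))                                # steps 7-8
    product = [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(Id, Fd)]
    Rd = dft2(product, inverse=True)                         # step 9
    mag = [[abs(v) for v in row] for row in Rd]              # step 10
    oy, ox = fy // 2, fx // 2                                # step 11 (assumed crop)
    out = [row[ox:ox + n] for row in mag[oy:oy + n]]
    lo = min(map(min, out))
    hi = max(map(max, out))
    scale = 255.0 / (hi - lo) if hi > lo else 0.0
    return [[(v - lo) * scale for v in row] for row in out]  # step 12
```

Applying `gabor_magnitude` to an image containing a single nonzero pixel returns the scaled Gaussian envelope of the filter centered on that pixel, which is a convenient sanity check of the padding and cropping offsets.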

The detection of edges is implemented as described by Canny [75]. Given the filter
output as computed above, the directional derivatives in both the $x$ and $y$ directions are
computed at each point $(i,j)$ in the output. The nearest 2 neighbors in each direction
are used to estimate these directional derivatives. Next, the gradient is estimated from
the directional derivatives at each point $(i,j)$. These gradients are then nonmaximally
suppressed. That is, at each point $(i,j)$ the gradient is compared to gradients in an
eight-point neighborhood. If the gradient at point $(i,j)$ is not the local maximum, it
is discarded. Finally, thresholding is used to eliminate small, spurious gradients. The
remaining nonzero points correspond to the edges in the filter output.
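The edge-detection steps described above can be sketched as follows. This is a simplified reading of the paragraph, not a full Canny detector and not the original code: the "nearest 2 neighbors" are interpreted as a central difference, border points are left unprocessed, ties in the eight-point comparison are kept, and a single threshold is used. The function name `detect_edges` is hypothetical.

```python
import math

def detect_edges(output, threshold):
    # Returns a boolean array marking points that survive nonmaximum
    # suppression over the eight-point neighborhood and the threshold.
    n_rows, n_cols = len(output), len(output[0])
    grad = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(1, n_rows - 1):
        for j in range(1, n_cols - 1):
            # Central differences from the two nearest neighbors in x and y.
            dx = (output[i][j + 1] - output[i][j - 1]) / 2.0
            dy = (output[i + 1][j] - output[i - 1][j]) / 2.0
            grad[i][j] = math.hypot(dx, dy)
    edges = [[False] * n_cols for _ in range(n_rows)]
    for i in range(1, n_rows - 1):
        for j in range(1, n_cols - 1):
            neighbors = [grad[i + di][j + dj]
                         for di in (-1, 0, 1) for dj in (-1, 0, 1)
                         if (di, dj) != (0, 0)]
            # Keep the point only if it is a local maximum above the threshold.
            edges[i][j] = grad[i][j] > threshold and grad[i][j] >= max(neighbors)
    return edges
```

For a filter output that rises from one level to another across a vertical boundary, the surviving points form a vertical line at the column of steepest change.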